(Professor Artin wants us to call him Mike.)
First of all, it’s good for us to read through the syllabus. On the back page, there is a diagnostic problem that is pretty simple; we should have it done by Monday. It will not count for our grade, but it is required. Apparently the problem sets are hard: we should not expect to finish them quickly.


Fact 1 


The two main topics of this semester will be group theory and linear algebra. 


When Professor Artin was young, he wanted to learn the “general axioms.” But when we're first learning mathematics, it’s better to start with examples and use those to understand the axioms.


Definition 2
The general linear group, denoted $GL_n$, consists of the invertible $n \times n$ matrices with the operation of matrix multiplication.


(The definition of a group was given on the next day.) To make examples easier to write down, we'll take n = 2. 


Matrix multiplication looks like the following: 


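$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 4 & 1 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 + 2 \cdot 4 & 1 \cdot 5 + 2 \cdot 1 \\ 3 \cdot 1 + 4 \cdot 4 & 3 \cdot 5 + 4 \cdot 1 \end{bmatrix} = \begin{bmatrix} 9 & 7 \\ 19 & 19 \end{bmatrix}.$$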


The definition of matrix multiplication seems kind of complicated, but it turns out we can come up with a natural explanation. One way to explain this definition is by looking at column vectors, with matrices “acting from the left side”: if $V$ is the space of 2-dimensional column vectors, we can treat our matrix $A$ as a linear operator on $V$, where a vector $v \in V$ gets sent to $Av$. The product $AB$ then corresponds to applying $B$ first and then $A$.


Fact 3
Given two matrices $A$ and $B$, it is generally not true that $AB = BA$. (For example, the matrices $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$ do not commute.) However, matrix multiplication is associative (that is, $(AB)C = A(BC)$), and we know this because we’re just composing three transformations in the same order: $C$, then $B$, then $A$.
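Both claims in Fact 3 are easy to test numerically. The following is a minimal Python sketch, assuming $2 \times 2$ matrices are stored as nested tuples; the helper name `matmul` and the three matrices are illustrative choices, not taken from the lecture.

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Illustrative matrices (assumed for this sketch).
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
C = ((2, 0), (0, 3))

AB = matmul(A, B)   # ((2, 1), (1, 1))
BA = matmul(B, A)   # ((1, 1), (1, 2))
commute = AB == BA  # False: the order of multiplication matters
associative = matmul(matmul(A, B), C) == matmul(A, matmul(B, C))  # True
```

A single check like this does not prove associativity, of course; the composition argument above is what does.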


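In this class, we'll generally deal with invertible matrices (because they make our group operations nicer). By the way, if we don’t know this already, the inverse of a 2 by 2 matrix is
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix},$$
which makes sense exactly when $ad - bc \neq 0$.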


Definition 4
An element $A$ of a group has order $n$ if $n$ is the smallest positive integer such that $A^n$ is the identity element.


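Example 5
Consider the matrix $A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$. Since $A^6 = I$ (and no smaller positive power of $A$ is the identity), $A$ is an element of $GL_2$ with order 6.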
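To see the definition of order in action, here is a small Python sketch that multiplies a matrix by itself until it reaches the identity. The function names and the `max_power` cutoff are assumptions made for this illustration, and the test matrix $\begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$ is one standard integer matrix of order 6.

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matrix_order(A, max_power=100):
    """Smallest n >= 1 with A^n = I, or None if none is found up to max_power."""
    identity = ((1, 0), (0, 1))
    power = A  # holds A^n at the start of iteration n
    for n in range(1, max_power + 1):
        if power == identity:
            return n
        power = matmul(power, A)
    return None

A = ((1, 1), (-1, 0))
order_A = matrix_order(A)          # 6

shear = ((1, 1), (0, 1))           # powers are ((1, n), (0, 1)), never the identity
order_shear = matrix_order(shear)  # None
```

The shear matrix is an example of an element of infinite order; the cutoff only keeps the search from running forever.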


Elements can have infinite order as well, but it turns out the space of $2 \times 2$ matrices is nice:

Theorem 6
If the entries of a $2 \times 2$ matrix $A$ are rational and $A$ has finite order, then the order must be 1, 2, 3, 4, or 6.

(We'll prove this much later on.) Professor Artin likes to use dots instead of zeros in matrices because they look cleaner, but I will not do this in the notes.


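Example 7
The following matrix just cycles the indices of a vector, so it has order $n$ if it is $n$-dimensional:
$$\begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}.$$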


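There are three kinds of elementary matrices, which basically change the identity matrix by a tiny bit. (This is the idea of row-reducing.) We have the matrices
$$\begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ a & 1 \end{bmatrix},$$
which add $a$ times the second row to the first row and vice versa, the matrices
$$\begin{bmatrix} c & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & c \end{bmatrix} \quad (c \neq 0),$$
which multiply one of the two rows by $c$, and the matrix
$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},$$
which swaps the two rows.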


Theorem 8
The elementary matrices generate $GL_2$. In other words, every $A$ in $GL_2$ is a product of the above elementary matrices.

Let's say we start with an arbitrary invertible matrix $A$. It's hard to randomly find matrices $E_1, E_2, \dots$ that multiply to $A$. Instead, we should work backwards and try to write
$$E_k \cdots E_2 E_1 A = I.$$
Then we know that $A = E_1^{-1} E_2^{-1} \cdots E_k^{-1}$, since all the elementary matrices have elementary matrix inverses. I’m not going to include how to do this here, but we basically work one column at a time and try to get the matrix to the identity.
Definition 9
Given two sets $S$ and $T$, the product set or Cartesian product $S \times T$ is defined to be the set of all ordered pairs $(s, t)$, where $s \in S$, $t \in T$.


Now we want to define (essentially) a binary operation: 


Definition 10
A law of composition on $S$ is a map $S \times S \to S$ sending $(a, b) \mapsto c$ (for $a, b, c \in S$). We often refer to $c$ as $a * b$, $ab$, $a + b$, or something else of that nature.


Professor Artin forgot to define a group last time, so we'll actually do that now: 


Definition 11
A group $G$ is a set with a law of composition that also satisfies the following axioms:
• Associativity holds; that is, $(a * b) * c = a * (b * c)$ for all $a, b, c \in G$.
• There exists an identity $i \in G$ such that for all $a \in G$, $a * i = i * a = a$.
• Inverses exist; that is, for all $a \in G$, there exists an $\bar{a} \in G$ such that $a * \bar{a} = \bar{a} * a = i$.

From now on, we'll refer to the group law as $a * b = ab$, the identity element as 1, and the inverse of $a$ as $a^{-1}$.


Example 12
$GL_n$ is a group, with group law (or binary operation) being matrix multiplication.

We should keep in mind that the identity is not always the number 1; for example, the identity for $GL_n$ is the identity matrix $I$. (This is just symbolic notation.)


Definition 13
A permutation of a set $S$ is a bijective map $S \to S$. (Importantly, $S$ does not need to be finite.)

Recall that a map $p: S \to S$ is called injective if $p(s_1) = p(s_2)$ implies $s_1 = s_2$ (it is one-to-one) and surjective if for all $t \in S$, there exists $s \in S$ such that $p(s) = t$. Then a bijective map is one that is both injective and surjective.

Lemma 14
A map $p$ is bijective if and only if it has an inverse function $\pi$ (in other words, $p\pi = I$, which happens if $p$ is surjective, and $\pi p = I$, which happens if $p$ is injective).

Lemma 15
If $S$ is a finite set, then a map $p: S \to S$ is injective if and only if it is surjective. Both of these are equivalent to $p$ being bijective.


Basically, all of these just mean we will “fill out” the whole set with p and not leave out any elements. 


Definition 16
Let the set of permutations of $S$ be $\mathrm{Perm}(S)$, and define a law of composition to be the composition of the two permutations. Then the symmetric group, denoted $S_n$, is the group of permutations $\mathrm{Perm}(\{1, \dots, n\})$.

The first interesting example happens when we take $n = 3$ (mostly because it’s the first time we have a nonabelian group, meaning not all elements commute). There are $3! = 6$ total permutations in $S_3$.


Example 17
Consider the two permutations $p, q \in S_3$ such that $p$ sends 1 to 2, 2 to 3, and 3 to 1, while $q$ keeps 1 fixed but swaps 2 and 3.

Now we want to try composing these together. To do this, just look at where each element goes individually; $pq$ corresponds to doing $q$, then $p$, so 1 goes to 1 and then 2. We find in the end that $pq$ sends $(1, 2, 3)$ to $(2, 1, 3)$.


Example 18 
Suppose $p \in S_5$ sends $(1, 2, 3, 4, 5)$ to $(3, 5, 4, 1, 2)$. Is there a more concise way to write this permutation?


One good question to ask here: what does this do visually if we draw arrows between our numbers 1, 2,3, 4,5? 
1 gets sent to 3, which gets sent to 4, which gets sent back to 1. This is a 3-cycle (because it has three elements). 
Meanwhile, 2 and 5 are part of their own 2-cycle or transposition. Thus, we can write this permutation in the cycle 
notation (134)(25). 


Fact 19 


Any permutation can be written as disjoint cycles; just start from any number and see where it goes. 


Example 20
Take $p \in S_5$ as above, and let $q$ have cycle notation $(12)(34)(5)$.

We can then find $pq$ by considering one number at a time and following the cycles. The result is written below:
$$pq = [(134)(25)][(12)(34)(5)] = (1523)(4).$$

Essentially, 1 goes to 2 under $q$ and then 5 under $p$, so 1 goes to 5. Then we try to see where 5 goes, and repeat until we've covered all of the numbers. (It’s important to remember that we do the action of $q$ before the action of $p$.)

There is one problem: (134) and (341) are actually the same cycle. Cycle notation isn’t unique! (For now, that 
really doesn't matter, though.) For convenience and by convention, we will also avoid writing the fixed points from 
now on. (So if we see an index that doesn’t appear, it just goes back to itself in our permutation.) 
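The bookkeeping in this example can be mechanized. Below is a minimal Python sketch, assuming permutations are stored as dictionaries on $\{1, \dots, n\}$; the helper names `compose`, `from_cycles`, and `to_cycles` are made up for this illustration.

```python
def compose(p, q):
    """Return pq: apply q first, then p."""
    return {i: p[q[i]] for i in q}

def from_cycles(cycles, n):
    """Build a permutation of {1..n} from a list of disjoint cycles (tuples)."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        # Pair each entry with its successor, wrapping around at the end.
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            perm[a] = b
    return perm

def to_cycles(perm):
    """Disjoint cycle decomposition, omitting fixed points."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles

p = from_cycles([(1, 3, 4), (2, 5)], 5)
q = from_cycles([(1, 2), (3, 4)], 5)
pq = to_cycles(compose(p, q))   # [(1, 5, 2, 3)]
```

Because `to_cycles` always starts each cycle at its smallest entry, it also sidesteps the non-uniqueness of cycle notation: `(3, 4, 1)` and `(1, 3, 4)` produce the same output.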


One last example: 


Example 21
Let $r = (12345)$. What is $rp$? What is $r^{-1}$? What is $rpr^{-1}$?

All of these are fairly straightforward calculations or observations. First of all,
$$rp = [(12345)][(134)(25)] = (142)(35).$$
Similarly, the inverse of a cycle can be found by traversing it in reverse:
$$r^{-1} = (12345)^{-1} = (15432).$$

Finally, $rpr^{-1}$ is a bit more important in our study:
$$rpr^{-1} = (rp)r^{-1} = [(142)(35)][(15432)] = (13)(245).$$


This last operation is called conjugation, and it is important (we will see later in the class that conjugate permutations 
have the same cycle type). 
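As a sanity check on the conjugation computed above, the following Python sketch recomputes $rpr^{-1}$ and compares cycle types. It assumes the same dictionary representation of permutations as a convenient choice; all helper names are hypothetical.

```python
def compose(p, q):
    """Return pq: apply q first, then p (permutations as dicts on {1..n})."""
    return {i: p[q[i]] for i in q}

def inverse(p):
    """Swap each element with its image."""
    return {image: i for i, image in p.items()}

def from_cycles(cycles, n):
    """Build a permutation of {1..n} from a list of disjoint cycles (tuples)."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            perm[a] = b
    return perm

def cycle_type(perm):
    """Sorted list of cycle lengths, fixed points included."""
    seen, lengths = set(), []
    for start in perm:
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            length += 1
            i = perm[i]
        if length:
            lengths.append(length)
    return sorted(lengths)

p = from_cycles([(1, 3, 4), (2, 5)], 5)
r = from_cycles([(1, 2, 3, 4, 5)], 5)

conjugate = compose(compose(r, p), inverse(r))   # r p r^{-1}
# Relabel every entry of p's cycles by r: (r(1) r(3) r(4))(r(2) r(5)).
relabeled = from_cycles([(r[1], r[3], r[4]), (r[2], r[5])], 5)
same_permutation = conjugate == relabeled
same_type = cycle_type(conjugate) == cycle_type(p)
```

The `relabeled` line illustrates one way to see why the cycle type is preserved: conjugating by $r$ amounts to renaming each index $i$ in the cycles of $p$ to $r(i)$.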
Now, we'll talk a bit about permutation matrices: these are ways to assign matrices to permutations. Specifically, $P$ operates on a column vector (which contains the elements of $S$) by applying $p$ to it.

Example 22
Let $p = (123) \in S_3$, and let’s say our column vector is $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$. Then we define a matrix $P$ associated to the permutation $p$ such that
$$Px = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_3 \\ x_1 \\ x_2 \end{bmatrix}.$$

Notice that this is permuting the entries, so $x_1$ ends up where $x_2$ is, and so on. In particular, this is the inverse of the actual permutation $p$.
An important idea from linear algebra is that of a basis; let’s define one for our matrices. Let the matrix units $e_{ij}$ be defined to have an entry 1 in row $i$, column $j$, and 0s everywhere else. Then if $A = (a_{ij})$, we can just write
$$A = \sum_{i,j} a_{ij} e_{ij}$$
(entry by entry). Our permutation matrix $P$ can then be written as
$$P = \sum_j e_{p_j, j}$$
(again, notice this is the inverse of the permutation $p$, since $p_j$ corresponds to $i$ instead of $j$ corresponding to $p_j$).
What if we want to compose these matrices? Well, notice that $e_{ij} e_{k\ell}$ is $e_{i\ell}$ if $j = k$ and zero otherwise. So given two permutation matrices $P, Q$,
$$PQ = \left(\sum_j e_{p_j, j}\right)\left(\sum_k e_{q_k, k}\right);$$
the terms become zero unless $j = q_k$, and we are left with
$$PQ = \sum_k e_{p_{q_k}, q_k} e_{q_k, k} = \sum_k e_{p_{q_k}, k},$$
which is what we want: multiplying matrices gives us our composition.
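The identity derived above can be checked on a small case. This is a minimal Python sketch, assuming the convention $P = \sum_j e_{p_j, j}$ (a 1 in row $p(j)$, column $j$); the function names are illustrative.

```python
def perm_matrix(p):
    """Matrix with a 1 in row p(j), column j (1-indexed), zeros elsewhere."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def compose(p, q):
    """Return pq: apply q first, then p (permutations as dicts on {1..n})."""
    return {i: p[q[i]] for i in q}

p = {1: 2, 2: 3, 3: 1}   # the cycle (123)
q = {1: 1, 2: 3, 3: 2}   # the transposition (23)
P, Q = perm_matrix(p), perm_matrix(q)

# Matrix product on the left, composition of permutations on the right.
product_matches = matmul(P, Q) == perm_matrix(compose(p, q))

# Acting on a column vector of labels: each row of P picks out one entry.
x = ["x1", "x2", "x3"]
Px = [x[row.index(1)] for row in P]   # ['x3', 'x1', 'x2']
```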

Let’s now return to the permutation group $S_3$: we'll try to generate our whole group with $p = (123)$ and $q = (23)$. Then $pq = (12)$, $p^2 = (132) = p^{-1}$, and $p^3 = 1$. Similarly, we can find that $q^2 = 1$ and $p^2 q = qp = (13)$: this is all the elements of $S_3$. So now we have a way to describe our group law:

Fact 23
$S_3$ is generated by two elements $x, y$, such that $x^3 = y^2 = 1$ and $yx = x^2 y$.

Now any element can be written in the form $x^a y^b$, where $0 \le a \le 2$ and $0 \le b \le 1$. We use the $yx = x^2 y$ part to turn any product of $x$s and $y$s into one with all the $x$s at the front and all the $y$s at the back, and then reduce the exponents mod 3 and 2, respectively. This is exactly the set of 6 permutations that we want!
Recall that a group's law of composition must be associative, have an identity, and have inverses.

Definition 24
A subgroup $H$ of a group $G$ is some subset of the group with the same law of composition. $H$ must be closed; that is, if $a, b \in H$, then $ab \in H$. In addition, $1_G$, the identity in $G$, must also be in $H$, and all inverses $a^{-1}$ of $a \in H$ must also be in $H$.


Here are a few simple examples of subgroups: 


Example 25 
The set of positive reals is a subgroup of R \ {0} (the set of nonzero real numbers), with multiplication as the 


law of composition. 


Example 26 


The special linear group $SL_n$ (the set of matrices in $GL_n$ of determinant 1) is a subgroup of $GL_n$, because the determinant is multiplicative.

There are lots and lots of subgroups of $GL_n$, and that’s why it’s important! We don’t have enough theory to describe them all, though.
Instead, we'll try talking about subgroups of $\mathbb{Z}^+$, the integers under addition. For a subset $H$ of the integers to be a subgroup, it must be closed under addition, it must contain 0, the additive identity, and if $a \in H$, then $-a \in H$.


Fact 27
$n\mathbb{Z} = \{\dots, -2n, -n, 0, n, 2n, \dots\}$, that is, the multiples of $n$, is a subgroup.

Proposition 28
All subgroups of $\mathbb{Z}^+$ are of the form $n\mathbb{Z}$ for some integer $n$.

Proof. Let $H$ be a subgroup of $\mathbb{Z}^+$. We know $0 \in H$; if $H$ is just that, we're done (and this is $0\mathbb{Z}$). Otherwise, let $a \neq 0$ be some element of $H$. Then $-a \in H$, and one of these must be positive, so $H$ contains at least one positive integer. Let $n$ be the smallest positive integer in $H$.

We claim now that $H = n\mathbb{Z}$. Since $n \in H$, $2n \in H$, and so on, and same with the negative multiples of $n$. Thus $n\mathbb{Z} \subseteq H$, and now we just want to show $H \subseteq n\mathbb{Z}$. Take some $b \in H$. By the division algorithm, we can write $b = nq + r$ for $q \in \mathbb{Z}$, $0 \le r < n$. Since $b \in H$ and $nq \in H$, we have $r = b - nq \in H$. But $0 \le r < n$, and $n$ was defined to be the smallest positive element, so $r = 0$. Thus $b$ must have been a multiple of $n$, so $H \subseteq n\mathbb{Z}$, concluding the proof.


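Corollary 29 (Bezout’s Lemma)
If $d$ is the greatest common divisor of $a, b$, then it can be written as $d = ra + sb$ for some integers $r, s$.

Proof. Let $a, b \in \mathbb{N}$. Then $a\mathbb{Z} + b\mathbb{Z}$ is a subgroup of $\mathbb{Z}$ under addition (it is easy to check closure, identity, and inverses). Thus it is of the form $d\mathbb{Z}$ for some $d \in \mathbb{N}$. This means $a\mathbb{Z} + b\mathbb{Z} \subseteq d\mathbb{Z}$, so $a, b \in d\mathbb{Z}$, so $d$ divides both $a$ and $b$. But we also know that $d\mathbb{Z} \subseteq a\mathbb{Z} + b\mathbb{Z}$, so we can write $d = ra + sb$ for some $r, s \in \mathbb{Z}$. Finally, any common divisor of $a$ and $b$ divides $ra + sb = d$, so $d$ is indeed the greatest common divisor.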
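The subgroup argument shows that $r$ and $s$ exist but does not say how to find them. One standard way to compute them is the extended Euclidean algorithm, which is a different technique from the proof above; the sketch below is a minimal Python version, with the function name `bezout` chosen for this illustration.

```python
def bezout(a, b):
    """Return (d, r, s) with d = gcd(a, b) = r*a + s*b (extended Euclid)."""
    old_rem, rem = a, b
    old_r, r = 1, 0   # coefficients of a
    old_s, s = 0, 1   # coefficients of b
    # Invariant: old_rem = old_r*a + old_s*b and rem = r*a + s*b.
    while rem != 0:
        quotient = old_rem // rem
        old_rem, rem = rem, old_rem - quotient * rem
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
    return old_rem, old_r, old_s

d, r, s = bezout(12, 42)
combination = r * 12 + s * 42   # equals d, the gcd of 12 and 42
```

The coefficients are not unique (adding a multiple of $b/d$ to $r$ and subtracting the matching multiple of $a/d$ from $s$ gives another valid pair), so only the identity $d = ra + sb$ should be relied on.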


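On the other hand, let $a, b \in \mathbb{N}$. Then $a\mathbb{Z} \cap b\mathbb{Z}$ is the set of integers divisible by both $a$ and $b$. The intersection of two subgroups is also a subgroup, so this set is $m\mathbb{Z}$ for some $m$. So if $a \mid x$ and $b \mid x$, then $m \mid x$, and if $m \mid y$, then $a \mid y$ and $b \mid y$. Taking $y = m$, we get $a \mid m$ and $b \mid m$, so $m$ is a common multiple that divides every other common multiple: it is the least common multiple of $a$ and $b$.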


Theorem 30 
Let a, b be positive integers. If d is their greatest common divisor and m is their least common multiple, then 
ab=dm. 
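Theorem 30 is easy to spot-check. The sketch below is a small Python illustration, assuming we find the least common multiple by direct search (so that the check does not secretly use the formula it is testing); `lcm_by_search` is a made-up helper name.

```python
from math import gcd

def lcm_by_search(a, b):
    """Smallest positive multiple of a that is also divisible by b."""
    m = a
    while m % b != 0:
        m += a
    return m

a, b = 12, 18
d, m = gcd(a, b), lcm_by_search(a, b)   # d = 6, m = 36
product_formula_holds = a * b == d * m  # 216 == 6 * 36
```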


Definition 31
A map $\varphi: G \to G'$ is an isomorphism if $\varphi$ is a bijective map, and for all $a, b \in G$, if $\varphi(a) = a'$ and $\varphi(b) = b'$, then $\varphi(ab) = a'b'$.


(One example is the identity map from G to itself.) 


Example 32
$S_n$ is isomorphic to the group of $n \times n$ permutation matrices. Also, the real numbers under addition are isomorphic to the positive reals under multiplication, using the exponential map $x \mapsto e^x$.

Theorem 33
Every group $G$ of order $|G| = n$ is isomorphic to a subgroup of $S_n$.

Proof. Let $g \in G$. Define a map $m_g : G \to G$ to be multiplication by $g$; that is, $m_g : x \mapsto gx$. Then $m_g$ is a permutation of the elements of $G$ (it is a bijective map, since inverses exist). Thus $M = \{m_g \mid g \in G\}$ is a subgroup of all permutations of $G$: it is closed because $m_g m_h(x) = ghx = m_{gh}(x)$, the identity permutation is in $M$ (it corresponds to the identity element in $G$), and $m_{g^{-1}} = m_g^{-1}$, so inverses exist too. The map $G \to M$ sending $g \mapsto m_g$ is compatible with the group laws and is bijective (it is injective because $m_g(1) = g$), so it is an isomorphism, and thus $G$ is isomorphic to a subgroup of $S_n$.


Example 34
Take $S_3 = \{1, x, x^2, y, xy, x^2 y\}$, where $x = (123)$, $y = (23)$.

We can embed this into $S_6$ as follows: assign the indices 1, 2, 3, 4, 5, 6 to the elements of $S_3$ above. Then $x$ is the permutation $(123)(456)$, since it sends 1 (with index 1) to $x$ (with index 2), and so on. $y$ is the permutation $(14)(26)(35)$, and now just compose those to get all other permutations. Thus, $S_3$ is isomorphic to a subgroup of $S_6$.
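The embedding in this example can be computed directly from the multiplication in $S_3$. The following is a minimal Python sketch, assuming permutations are stored as tuples whose entry at position $i - 1$ is the image of $i$; the names `compose` and `left_mult_perm` are hypothetical.

```python
def compose(p, q):
    """Return pq: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

e = (1, 2, 3)
x = (2, 3, 1)   # the cycle (123)
y = (1, 3, 2)   # the transposition (23)
x2 = compose(x, x)

# Indexing used in the example: 1, x, x^2, y, xy, x^2 y get indices 1..6.
elements = [e, x, x2, y, compose(x, y), compose(x2, y)]

def left_mult_perm(g):
    """Permutation of the indices 1..6 induced by h -> g h."""
    return tuple(elements.index(compose(g, h)) + 1 for h in elements)

m_x = left_mult_perm(x)   # (2, 3, 1, 5, 6, 4), i.e. (123)(456)
m_y = left_mult_perm(y)   # (4, 6, 5, 1, 3, 2), i.e. (14)(26)(35)

# The assignment g -> m_g respects products and is injective.
respects_products = all(
    left_mult_perm(compose(g, h)) == compose(left_mult_perm(g), left_mult_perm(h))
    for g in elements for h in elements
)
images = {left_mult_perm(g) for g in elements}
```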
Recall that an isomorphism is a bijective map compatible with group laws. Specifically, it sends $G \to G'$ in such a way that $a \mapsto a'$ and $b \mapsto b'$ means $ab \mapsto a'b'$. There usually do exist isomorphisms from a group to itself that are not the identity, though. (For example, negation in $\mathbb{Z}$ works.) We'll generalize this concept a bit:

Definition 35
A homomorphism of groups $G \to G'$ is a map that is compatible with group laws, but doesn’t necessarily have to be bijective. If we call our homomorphism $\varphi$, we can write this as
$$\varphi(ab) = \varphi(a)\varphi(b) \quad \forall a, b \in G.$$


In other words, homomorphisms are similar to isomorphisms, but two elements of G can be sent to the same 


element of G’. 


Example 36
The determinant map $\det: GL_n(\mathbb{R}) \to \mathbb{R}^\times$ is a homomorphism.

In particular, $\det$ sends a matrix $A$ to $\det A$, and indeed $AB \mapsto \det(AB) = \det A \det B$, which means the map $\det$ is compatible with group laws.


Lemma 37 (Cancellation law) 


Let a, b,c be elements of a group. If ab = ac, then b= c. 


Proof. Multiply by $a^{-1}$ on the left side.

Lemma 38
Let $\varphi: G \to G'$ be a homomorphism. Then $\varphi(1_G) = 1_{G'}$ and $\varphi(a^{-1}) = \varphi(a)^{-1}$.

Proof. For the first part, plug in $a = b = 1$ into the definition of a homomorphism. Since $1 = 1 \cdot 1$, $\varphi(1) = \varphi(1)\varphi(1)$, which means $\varphi(1) = 1_{G'}$ by the cancellation law. For the second part, plug in $b = a^{-1}$. Then $\varphi(1) = \varphi(a)\varphi(a^{-1})$, and the left hand side is the identity (by the previous part).


Example 39
The map $\mathbb{C}^+ \to \mathbb{C}^\times$ sending $x \mapsto e^{2\pi i x}$ is a homomorphism. (This factor of $2\pi i$ in the exponent is nice because 1 gets sent to 1.)

This is not bijective; in particular, if two complex numbers $x$ and $y$ differ by an integer, they are sent to the same element of $\mathbb{C}^\times$.


Example 40
Every permutation has a sign, so we can define a map $S_n \to \{\pm 1\}$, where we send a permutation to 1 if the sign is positive (an even permutation) and $-1$ if the sign is negative (an odd permutation).

How do we define sign? $p \in S_n$ corresponds to a permutation matrix $P$. Then we just define the sign to be the determinant of $P$. Since permutations essentially swap rows of the identity matrix, and each swap multiplies the determinant by $-1$, the sign of a permutation is $(-1)^k$, where $k$ is the number of transpositions used to build it!
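The sign map can be checked to be a homomorphism on a small symmetric group. The sketch below computes the sign by counting inversions rather than taking a determinant; the two agree, but the inversion count is a swapped-in shortcut and not the definition used above. Permutations are assumed to be tuples of images, and the helper names are illustrative.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation tuple: +1 for an even number of inversions, else -1."""
    inversions = sum(
        1
        for i in range(len(p))
        for j in range(i + 1, len(p))
        if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

def compose(p, q):
    """Return pq: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

S3 = list(permutations((1, 2, 3)))
is_homomorphism = all(
    sign(compose(p, q)) == sign(p) * sign(q) for p in S3 for q in S3
)
even_permutations = [p for p in S3 if sign(p) == 1]   # identity and the two 3-cycles
```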


Example 41
Let $G$ be any group, and pick some $a \in G$. Then there exists a map $\mathbb{Z}^+ \to G$ that sends $k \mapsto a^k$.

Of course, $a^{k+\ell} = a^k a^\ell$, so this is a homomorphism.


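Definition 42
Let $\varphi: G \to G'$ be a homomorphism. Then the image of $\varphi$, denoted $\operatorname{Im} \varphi$, is everything in $G'$ that is hit; that is, it is the set
$$\operatorname{Im} \varphi = \{a' \in G' : \exists x \in G \text{ s.t. } \varphi(x) = a'\}.$$

Proposition 43
If $\varphi$ is a homomorphism from $G$ to $G'$, then $\operatorname{Im} \varphi$ is a subgroup of $G'$.

Proof. If $a', b' \in \operatorname{Im} \varphi$, then there exist $x, y$ such that $\varphi(x) = a'$, $\varphi(y) = b'$. Then $\varphi(xy) = \varphi(x)\varphi(y) = a'b'$, so $a'b'$ is also in the image of $\varphi$. Thus, $\operatorname{Im} \varphi$ has closure. The identity is clearly in $\operatorname{Im} \varphi$, since $\varphi(1_G) = 1_{G'}$. Finally, if $a' \in \operatorname{Im} \varphi$, then $\varphi(x) = a' \implies \varphi(x^{-1}) = a'^{-1}$. Thus all group axioms are indeed satisfied.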


Definition 44
Let $\varphi: G \to G'$ be a homomorphism. Then the kernel of $\varphi$, denoted $\ker \varphi$, is the set of elements in $G$ that go to the identity; that is,
$$\ker \varphi = \{x \in G : \varphi(x) = 1_{G'}\}.$$

Proposition 45
If $\varphi$ is a homomorphism from $G$ to $G'$, then $\ker \varphi$ is a subgroup of $G$.

Proof. If $x, y \in \ker \varphi$, then $\varphi(xy) = \varphi(x)\varphi(y) = 1_{G'} 1_{G'} = 1_{G'}$. Like before, $\varphi(1_G) = 1_{G'}$, and if $\varphi(a) = 1_{G'}$, then $\varphi(a^{-1}) = 1_{G'}^{-1} = 1_{G'}$. Thus the kernel $\ker \varphi$ satisfies all group axioms.

Proposition 46 (Extra property of the kernel)
If $x \in \ker \varphi$ and we have an arbitrary element $g \in G$, then $gxg^{-1}$ (called the conjugate of $x$ by $g$) is in $\ker \varphi$.

(Conjugates of elements of the kernel are in the kernel.)

Proof. This is just a direct computation:
$$\varphi(gxg^{-1}) = \varphi(g)\varphi(x)\varphi(g^{-1}) = \varphi(g)\varphi(g^{-1}) = 1_{G'},$$
so $gxg^{-1} \in \ker \varphi$.


Definition 47 
A subgroup $H$ of $G$ is called normal if for all $x \in H$, $g \in G$, we have that $gxg^{-1} \in H$ as well.


For example, we just showed that the kernel is a normal subgroup. 


Example 48
We continue some of the above examples:
• In the determinant map $GL_n \to \mathbb{R}^\times$, the kernel is $SL_n$.
• In the exponential map $\mathbb{C}^+ \to \mathbb{C}^\times$, the kernel is $\mathbb{Z}$.
• The kernel of the sign map for $S_n$ is called the alternating group $A_n$.
• The map that sends $k \mapsto a^k$ for an arbitrary $a \in G$ has kernel 0 or $n\mathbb{Z}$, depending on the order of $a$ in $G$.

The last map in the example above has image $\{\dots, a^{-1}, 1, a, a^2, \dots\}$, where this set may or may not be infinite depending on whether the kernel is 0 or $n\mathbb{Z}$. This is denoted $\langle a \rangle$, and it is called the subgroup generated by $a$. If $a$ has order $n$, that means the group is $\{1, a, \dots, a^{n-1}\}$, and $a^n = 1$.

But if we have more than one element, things are harder. Sometimes we don’t even know how to write down the group generated by two elements $a, b$ with some relations.


Example 49
We can construct a homomorphism $S_4 \to S_3$. There are three ways to partition $\{1, 2, 3, 4\}$ into two sets of 2 (it depends what 1 goes with). Call $\pi_1$ the partition that puts $\{1, 2\}$ together, $\pi_2$ the one that puts $\{1, 3\}$ together, and $\pi_3$ the one that puts $\{1, 4\}$ together. Then any permutation in $S_4$ permutes $\pi_1, \pi_2, \pi_3$, which corresponds to an element of $S_3$!

For example, let $p = (123)(4) \in S_4$. Then $\pi_1 \to \pi_3$, $\pi_2 \to \pi_1$, $\pi_3 \to \pi_2$. We can write this as $\tilde{p} = (\pi_1 \pi_3 \pi_2)$. On the other hand, $q = (1234)$ sends $\pi_1 \to \pi_3$, $\pi_2 \to \pi_2$, $\pi_3 \to \pi_1$, so $\tilde{q} = (\pi_1 \pi_3)(\pi_2)$.
We can have fun checking that $pq \mapsto \tilde{p}\tilde{q}$, so this is indeed a homomorphism! It is also a surjective one, since $\tilde{p}$ and $\tilde{q}$ generate the whole group $S_3$. One exercise for us: what is the kernel of this map?
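Since $S_4$ only has 24 elements, the map in this example can be explored by brute force. The following is a minimal Python sketch, assuming permutations are tuples of images and each partition is stored as a set of two pairs; `PARTITIONS`, `phi`, and `compose` are names invented for this illustration.

```python
from itertools import permutations

# The three ways to split {1, 2, 3, 4} into two pairs, listed as pi_1, pi_2, pi_3.
PARTITIONS = [
    frozenset({frozenset({1, 2}), frozenset({3, 4})}),
    frozenset({frozenset({1, 3}), frozenset({2, 4})}),
    frozenset({frozenset({1, 4}), frozenset({2, 3})}),
]

def compose(p, q):
    """Return pq: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def phi(p):
    """Element of S_3 recording where p sends pi_1, pi_2, pi_3."""
    images = []
    for part in PARTITIONS:
        moved = frozenset(frozenset(p[i - 1] for i in pair) for pair in part)
        images.append(PARTITIONS.index(moved) + 1)
    return tuple(images)

S4 = list(permutations((1, 2, 3, 4)))
p = (2, 3, 1, 4)   # the cycle (123)
q = (2, 3, 4, 1)   # the cycle (1234)

p_tilde = phi(p)   # (3, 1, 2): pi_1 -> pi_3, pi_2 -> pi_1, pi_3 -> pi_2
q_tilde = phi(q)   # (3, 2, 1): pi_1 -> pi_3, pi_2 fixed, pi_3 -> pi_1

is_homomorphism = all(
    phi(compose(g, h)) == compose(phi(g), phi(h)) for g in S4 for h in S4
)
image = {phi(g) for g in S4}
kernel = [g for g in S4 if phi(g) == (1, 2, 3)]
```

Inspecting `kernel` is one way to approach the exercise at the end of the example.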
Definition 50
Let $H$ be a subgroup of $G$, and let $a \in G$. Then a left coset $C = aH$ is the subset of $G$ defined via
$$C = \{ah : h \in H\}.$$

There are many different cosets for a single subgroup $H$.

Example 51
Let $G$ be the symmetric group $S_3 = \{1, x, x^2, y, xy, x^2 y\}$. Let $H = \{1, xy\}$ (notice that $(xy)^2 = xyxy = x^3 y^2 = 1$, using $yx = x^2 y$).

Then we have
$$1H = xyH = \{1, xy\},$$
$$xH = x^2 yH = \{x, x^2 y\},$$
$$x^2 H = yH = \{x^2, y\}.$$

In particular, there are six ways to get a coset, but there are only three cosets.
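The coset computation in this example is small enough to reproduce by machine. Here is a minimal Python sketch, assuming $x = (123)$ and $y = (23)$ as earlier in the notes and permutations stored as tuples of images; the helper names are illustrative.

```python
def compose(p, q):
    """Return pq: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

e, x, y = (1, 2, 3), (2, 3, 1), (1, 3, 2)
x2 = compose(x, x)
xy, x2y = compose(x, y), compose(x2, y)

G = [e, x, x2, y, xy, x2y]
H = [e, xy]

def left_coset(a, H):
    """The set aH, stored as a frozenset so equal cosets compare equal."""
    return frozenset(compose(a, h) for h in H)

# Six choices of a, but duplicates collapse when collected into a set.
cosets = {left_coset(a, H) for a in G}
number_of_cosets = len(cosets)            # 3
sizes = {len(c) for c in cosets}          # {2}: every coset has |H| elements
covers_group = set().union(*cosets) == set(G)
```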


Fact 52 


All cosets have the same order as H. 


This is because multiplying by some group element $a$ is reversible; we can get backwards from $aH$ to $H$ by multiplying every element by $a^{-1}$. This is a bijective correspondence, so $H$ and $aH$ have the same order.


Proposition 53 


The cosets partition the group and are disjoint (unless they refer to the same coset). 


First, we prove a lemma: 


Lemma 54 


Let $C$ be a coset. If $b \in C$, then $C = bH$ (in particular, all elements of $G$ are in a coset of $H$).

Proof. We know that $C = aH$ for some $a$, so $b = ah$ for some $h \in H$. Thus $bH = ahH = a(hH)$, and $hH$ is just $H$. Thus we have $bH = aH = C$.

Except we used some symbolic notation above, so let’s write this out more rigorously. We wish to show that if $b = ah$ for some $h \in H$, then $bH = aH$. Take an element of $bH$; it looks like $bh'$ for some $h' \in H$. So $bh' = ahh'$, and since $hh' \in H$ (since $H$ is closed), $bh' \in aH$. Thus $bH \subseteq aH$. On the other hand, we also have $bh^{-1} = a$, and we can do the same reasoning to show that $aH \subseteq bH$; thus, $aH = bH$, and thus $C = bH$. The central idea here is just that we can tack on an $h$ or $h^{-1}$ to convert between $bH$ and $aH$.


To finish proving the above proposition, we wish to show the cosets partition G. 


Proof. (First of all, all cosets are nonempty, since $1 \in H$, so $a \in aH$. In addition, $a \in aH$ means every element is in a coset, so the cosets cover $G$.)
Let $C_1, C_2$ be cosets that have nonempty intersection; we wish to show that $C_1 = C_2$. We can write $C_1 = a_1 H$, $C_2 = a_2 H$ for some $a_1, a_2$ in our group. Then if $b \in C_1 \cap C_2$, then $b \in a_1 H \implies bH = a_1 H$ by the lemma. In addition, $b \in a_2 H \implies bH = a_2 H$. Thus $a_1 H = a_2 H$, so two different cosets cannot have nonempty intersection.


Definition 55 


Let H be a subgroup of G. Then the index of H in G, denoted [G : H], is the number of (disjoint) cosets for H. 


We know that the size of each coset is the size of H. This yields the following result: 


Fact 56 (Counting) 
For any subgroup $H$ of a group $G$, we have $|G| = [G : H]\,|H|$.


Corollary 57 (Lagrange’s theorem) 
|H| divides |G| for any subgroup H. 


This has a nice corollary if |G| = p, where p is prime: 


Corollary 58
For groups of prime order $p$, the only subgroups are the whole group and the trivial subgroup (containing only the identity). In addition, the order of any non-identity element of $G$ is $p$, since the subgroup generated by that element has more than 1 element.

This means that every group of order $p$ is a cyclic group! Just take some non-identity element $a$, and the group can be written as $\{1, a, \dots, a^{p-1}\}$.

Corollary 59
The order of any element $a$ of a group $G$ divides $|G|$.

Proof. The element $a$ generates a subgroup whose order is equal to the order of that element, and then we can use Lagrange’s theorem to get the desired result.


Next, let's try to look at groups with non-prime order. 


Example 60 


What are all the groups of order 4? 
Elements must have order 1, 2, or 4. The only element with order 1 is the identity. If $G$ contains an element $g$ of order 4, then $G$ is the cyclic group of order 4, $\{1, g, g^2, g^3\}$. This can be denoted $\mathbb{Z}/4\mathbb{Z}$, as the integers mod 4 with addition. Otherwise, all other elements have order 2. So $G = \{1, x, y, z \mid x^2 = y^2 = z^2 = 1\}$.

Now $xy$ is one of 1, $x$, $y$, or $z$. But by cancellation, it can't be $x$ or $y$. If $xy = 1$, then $xy = x^2$, so $x = y$, which is bad too (we assumed $x, y, z$ are distinct). So $xy = z$, and similarly, $xz = y$, $yz = x$.

This gives the Klein four group, which is actually just $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$; that is, all ordered pairs of integers mod 2, with group operation addition. Thus, there are two groups of order 4.

Now, let’s apply this idea to homomorphisms $\varphi : G \to G'$. We know that the elements of the kernel $K = \{x \in G \mid \varphi(x) = 1'\}$ are all sent to $1'$.


Fact 61
Let $a$ be an element of $G$, and let $\varphi$ be a homomorphism from $G$ to $G'$ with kernel $K$. Then every element of the coset $aK$ is sent to $\varphi(a)$.

(This can be verified by writing out $\varphi(ak) = \varphi(a)\varphi(k) = \varphi(a)$ for any $k \in K$.) In fact, this goes both ways:

Lemma 62
Let $a, b \in G$. Then $\varphi(b) = \varphi(a)$ if and only if $b \in aK$, where $K$ is the kernel of $\varphi$.

Proof. $b \in aK$ means that $b = ak$ for some $k \in K$. Then $\varphi(b) = \varphi(ak) = \varphi(a)\varphi(k) = \varphi(a)$. On the other hand, $\varphi(b) = \varphi(a) \implies \varphi(a^{-1}b) = \varphi(a^{-1})\varphi(b) = \varphi(a^{-1})\varphi(a) = \varphi(a^{-1}a) = \varphi(1) = 1'$. Thus $a^{-1}b$ is in the kernel, so $b \in aK$.


Corollary 63
$|G| = |\operatorname{Im} \varphi|\,|\ker \varphi|$ for any homomorphism $\varphi: G \to G'$.

For example, in the example above $\varphi : S_4 \to S_3$, the image was the entire group $S_3$, but there were 4 elements in the kernel (it is the identity, $(12)(34)$, $(13)(24)$, and $(14)(23)$). $24 = 6 \cdot 4$, and mathematics works.

Recall that last time, we had the “counting formula” $|G| = |\operatorname{Im} \varphi|\,|\ker \varphi|$. We'll see some quick applications of this here:

Example 64
The kernel of the sign map that sends $S_n \to \{\pm 1\}$ is the alternating group $A_n$. It has order $\frac{n!}{2}$ for $n \ge 2$, since the image has order 2.
Theorem 65 (Correspondence Theorem)
Let $\varphi: G \to G'$ be a (surjective) homomorphism. (Surjectivity means we don't have to mention the image of $G$ all the time.) Then there is a bijective correspondence between subgroups of $G$ containing $\ker \varphi$ and subgroups of $G'$.

Specifically, if $H$ is a subgroup of $G$ containing $\ker \varphi$, then $H$ corresponds to $H' = \varphi H$ (symbolic notation; it’s everything we get when we apply $\varphi$ to elements in $H$). On the other hand, if $H'$ is a subgroup of $G'$, it corresponds to $H = \varphi^{-1} H'$ (again symbolic notation; the set of all $x$ such that $\varphi(x) \in H'$). In addition, we have $|H| = |H'|\,|\ker \varphi|$.

Sketch. First, let $H$ be a subgroup of $G$ containing $K = \ker \varphi$. We can show $\varphi H = H'$ is a subgroup of $G'$ by verifying the group axioms; this is not too hard.

Now, let $H'$ be a subgroup of $G'$. We show that $\varphi^{-1} H'$ is a subgroup of $G$ containing the kernel. We know that the inverse image of the identity, which is contained in $H'$, is the kernel, so the kernel is contained in $\varphi^{-1} H'$. The identity is in the kernel, so we get that for free. Closure is not too hard either: if $x, y \in \varphi^{-1} H'$, then $\varphi(x), \varphi(y) \in H'$, so $\varphi(x)\varphi(y) = \varphi(xy) \in H' \implies xy \in \varphi^{-1} H'$. Inverses can be checked pretty easily too.

Finally, we show the correspondence: $\varphi^{-1}\varphi H = H$ if $H$ contains the kernel $\ker \varphi$. We know that $\varphi^{-1}\varphi H \supseteq H$ for every map, so we just need to show that $\varphi^{-1}\varphi H \subseteq H$. Let $x \in \varphi^{-1}\varphi H$; this means that $\varphi(x) \in \varphi H$. Therefore, $\varphi(x) = \varphi(y)$ for some $y \in H$, so $\varphi(xy^{-1}) = 1$, so $xy^{-1} \in \ker \varphi$, so $x \in (\ker \varphi) y \subseteq H$, since $H$ contains the kernel.

Lastly, $\varphi \varphi^{-1} H' = H'$ (which is true for any surjective map).


This is pretty interesting when we apply it to a few examples. 


Example 66
Let $G = G' = \mathbb{C}^\times$. Consider the map $\varphi$ sending $z \mapsto z^2$. (In general, this map is a homomorphism if and only if the group is abelian, since $(xy)^2 = xyxy \neq x^2 y^2$ unless $xy = yx$.) Then the kernel is the set of all $z \in \mathbb{C}^\times$ such that $z^2 = 1$; thus, $\ker \varphi = \{-1, 1\}$.

Now we can pick a subgroup and apply the correspondence theorem. The subgroup $H_1 = \mathbb{R}^\times \subseteq G$ corresponds to the image $H_1'$ of $\mathbb{R}^\times$ under $\varphi$, which is the group of positive real numbers under multiplication. On the other hand, $H_2' = \{\pm 1, \pm i\} \subseteq G'$, the set of fourth roots of unity, is a subgroup of $G'$, so it corresponds to the subgroup $H_2$ of complex numbers whose squares are fourth roots of unity: that is, the eighth roots of unity. Notice that we indeed have $|H_2| = 8 = |H_2'|\,|\ker \varphi| = 4 \cdot 2$.


Example 67
Let $G = S_4$, $G' = S_3$, and let $\varphi$ be the map defined on September 12. Let’s write out the correspondence explicitly.

(Recall that $\pi_1, \pi_2, \pi_3$ are the three partitions of $\{1, 2, 3, 4\}$ into two pairs, and $G'$ tracks the permutation of $\{\pi_1, \pi_2, \pi_3\}$ when $G$ tracks the permutation of $\{1, 2, 3, 4\}$. In particular, the kernel of $\varphi$ is $\{1, (12)(34), (13)(24), (14)(23)\}$.)
What are the subgroups of $S_3$? They have to have order 1 (trivial), 6 (the whole group), or 2 or 3. The latter two cases are prime order, so they must be cyclic; we can just look at each element and find its order. Write
$$S_3 = \{1, x, x^2, y, xy, x^2 y\}; \quad x = (\pi_1 \pi_2 \pi_3), \; y = (\pi_2 \pi_3).$$

Then the four other subgroups are generated by $y$, $xy$, and $x^2 y$, each with order 2, and $x$ with order 3. Thus, by the correspondence theorem, there are exactly six subgroups of $S_4$ that contain the kernel $\ker \varphi$. They will have order 4, 8, 8, 8, 12, and 24. The smallest of these is $\ker \varphi$, the largest is $S_4$, and the one with order 12 is likely $A_4$, as long as $\ker \varphi$ is contained inside $A_4$ (the answer turns out to be yes, since each nontrivial element of the kernel is a product of two transpositions and is therefore even).

How would we find the subgroups of order 8? We know that the permutation $y \in S_3$ has order 2. We've defined $y$ to switch $\pi_2$ and $\pi_3$, so we need to find some permutation not in the kernel that fixes $\pi_1$ and switches $\pi_2$ and $\pi_3$. It turns out $(12)$ swaps $\pi_2$ and $\pi_3$, so $(12) \mapsto y$. Thus $(12) \in H_1$, and now we can get the rest of $H_1$ by taking the elements of the kernel and multiplying them by $(12)$. In other words, we can write this as
$$H_1 = \langle \ker \varphi, (12) \rangle = \ker \varphi \cup \{(12), (34), (1324), (1423)\}.$$

Similarly, we can also find $H_2, H_3$.
Unfortunately, this correspondence theorem does not tell us about other subgroups that don’t contain the kernel. For example, there are many subgroups of $S_4$ with order 2 or 3 (generated by transpositions and 3-cycles). But we've still managed to gain quite a bit of structure!


7 September 19, 2018 


Let’s quickly review some important ideas: a left coset of a subgroup $H$ of a group $G$ is a subset $C = aH$ consisting of all elements of the form $ah$ for a fixed $a \in G$ and $h \in H$. These cosets are important because they are the same size and partition the group: this gives us a counting formula
$$|G| = [G : H]\,|H|,$$
where $[G : H]$ is called the index of $H$ and is the number of cosets of $H$.

Definition 68
Let $\varphi: G \to G'$ be a homomorphism with kernel $K$. The fibre of an element in $G'$ is the inverse image of that element: specifically, the fibre over $z \in G'$ is
$$\varphi^{-1}(z) = \{g \in G : \varphi(g) = z\}.$$

Proposition 69
The nonempty fibres of $\varphi$ are the left cosets of $K$.

It's actually not so important that we distinguish between left and right cosets here:

Fact 70
The left and right cosets of the kernel $\ker \varphi$ for a group homomorphism are the same.

Proof. If $x \in K$, then $gxg^{-1} \in K$, because $\varphi(gxg^{-1}) = \varphi(g)\varphi(x)\varphi(g^{-1}) = \varphi(g)\varphi(g^{-1}) = \varphi(1) = 1$. Thus, $gKg^{-1} = K$, meaning that $gK = Kg$ (we can write this out without symbolic notation if we want; see the proposition below).


It is important to note that left and right cosets are not always the same, though! 
Example 71
Take $S_3 = \{1, x, x^2, y, xy, x^2 y\}$ and let $H = \{1, y\}$. Then the left cosets are
$$1H = yH = \{1, y\}, \quad xH = xyH = \{x, xy\}, \quad x^2 H = x^2 yH = \{x^2, x^2 y\},$$
but this is not the set of right cosets:
$$H1 = Hy = \{1, y\}, \quad Hx = Hx^2 y = \{x, x^2 y\}, \quad Hx^2 = Hxy = \{x^2, xy\}.$$
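The mismatch between left and right cosets in this example can be confirmed by direct computation. The sketch below is a minimal Python illustration, again assuming $x = (123)$, $y = (23)$ and permutations stored as tuples of images; the function names are hypothetical. For contrast it also tests the subgroup $\{1, x, x^2\}$, where the two families of cosets agree.

```python
def compose(p, q):
    """Return pq: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

e, x, y = (1, 2, 3), (2, 3, 1), (1, 3, 2)
x2 = compose(x, x)
G = [e, x, x2, y, compose(x, y), compose(x2, y)]

def left_cosets(H):
    """All distinct sets aH for a in G."""
    return {frozenset(compose(a, h) for h in H) for a in G}

def right_cosets(H):
    """All distinct sets Ha for a in G."""
    return {frozenset(compose(h, a) for h in H) for a in G}

H = [e, y]        # the subgroup {1, y}
N = [e, x, x2]    # the subgroup {1, x, x^2}

H_cosets_agree = left_cosets(H) == right_cosets(H)   # False
N_cosets_agree = left_cosets(N) == right_cosets(N)   # True
```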


Many of the statements we can make about left cosets are also true for right cosets, but it’s just important to 


notice that they don’t coincide exactly! If left and right cosets are the same, though, we are pretty happy. 


Definition 72
A subgroup $H$ of $G$ is called a normal subgroup if for all $x \in H$, $g \in G$, $gxg^{-1} \in H$.

An example is $A_n \subseteq S_n$; we know this is normal because it is the kernel of the sign map.


Proposition 73 


The following are equivalent:

* H is a normal subgroup.
* Every left coset of H is a right coset. (Specifically, if the left coset C = aH is a right coset, then it is also Ha.)

Proof. We'll do this with symbolic notation. H is normal if and only if gHg^{-1} ⊆ H for all g ∈ G. Applying this to g^{-1} gives g^{-1}Hg ⊆ H, which implies H ⊆ gHg^{-1}. Thus H is normal if and only if H = gHg^{-1} for all g. Now multiply both sides on the right by g, and we have Hg = gH.

For completeness, we'll also write this out more concretely. Suppose H is normal. Then for all h ∈ H, g ∈ G, we have that ghg^{-1} ∈ H, and we wish to show that gH = Hg. Every element of gH has the form gh for some h ∈ H, so we want to show that there exists h' ∈ H such that gh = h'g. We use a trick: write

yx = x(x^{-1}yx),

so that the order of x and y flips, but y changes to a conjugate. In other words, we can move elements past each other, but then we have to conjugate y. Similarly, we can also move y over to the other side of x:

yx = (yxy^{-1})y,

and this time we conjugate x. So here, gh = (ghg^{-1})g, but since H is a normal subgroup, ghg^{-1} ∈ H. So gh ∈ Hg, and we're done! The other direction is basically the same. Finally, if aH = Hb, then a ∈ Hb, so Ha = Hb and we actually have aH = Ha.


Next question: we know that kernels of homomorphisms are normal. Is every normal subgroup the kernel of some homomorphism? The answer is yes! We can make the cosets of a normal subgroup into a group (and remember now that the left and right cosets for a normal subgroup are the same):

Definition 74
Let N be a normal subgroup of G. The quotient group Ḡ or G/N is the set of cosets of N. This is a group with law of composition equal to the product set:

C_1C_2 = (aN)(bN) = a(Nb)N = a(bN)N = (ab)N.

Formally, the product set of two subsets S, T ⊆ G is

ST = {g ∈ G | g = st, s ∈ S, t ∈ T}.

It is extremely important that the subgroup has to be normal:


Example 75
Take C_1 = xH = {x, xy} and C_2 = x^2H = {x^2, x^2y} in S_3. Then C_1C_2 is not a coset: it is {x·x^2, x·x^2y, xy·x^2, xy·x^2y} = {1, y, x^2y, x^2}, which has 4 elements, and 4 ∤ 6. This is because H = {1, y} was not normal.
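
The coset computations in Examples 71 and 75 can be checked by brute force. The following is a minimal sketch in Python, assuming we model S_3 as permutations of {0, 1, 2}; the particular permutations chosen for x and y and the helper names (compose, left_coset, right_coset, product_set) are our own choices for illustration, not part of the lecture.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

G = set(permutations(range(3)))   # all six elements of S_3
e = (0, 1, 2)                     # identity
x = (1, 2, 0)                     # a 3-cycle
y = (1, 0, 2)                     # a transposition
x2 = compose(x, x)
H = {e, y}                        # the non-normal subgroup {1, y}
N = {e, x, x2}                    # the normal subgroup {1, x, x^2}

def left_coset(a, S):
    return frozenset(compose(a, s) for s in S)

def right_coset(S, a):
    return frozenset(compose(s, a) for s in S)

def product_set(S, T):
    return frozenset(compose(s, t) for s in S for t in T)

left_cosets = {left_coset(a, H) for a in G}
right_cosets = {right_coset(H, a) for a in G}

# Product of two left cosets of H (Example 75): not a coset.
prod_H = product_set(left_coset(x, H), left_coset(x2, H))

# Product of two cosets of the normal subgroup N: again a coset.
prod_N = product_set(left_coset(y, N), left_coset(y, N))
```

With H the product set has the wrong size to be a coset, while for the normal subgroup N the product (yN)(yN) collapses back to the single coset N.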


The identity element of the quotient group G/N is the coset N, and the inverse of aN is a^{-1}N. So ultimately, we have a homomorphism G → Ḡ which sends a ↦ aN. This homomorphism has kernel N, so N is always the kernel of some map!


Example 76 


We finish with two quick examples of quotient groups:

* Z/nZ is the set of (congruence classes of) integers mod n.
* Take G = SL_2 and N = {I, −I} (which is the center of SL_2). Then AN = {A, −A} for any matrix A, so Ḡ = G/N is a set of pairs of matrices (differing by sign). This construction comes up in other areas of mathematics, too.


8 September 24, 2018 


We'll switch gears for a few lectures: the main difficulty of linear algebra is the notation. 


Definition 77
A field F is a set with +, −, ×, / operations, where 0 (the additive identity) is excluded from division: the nonzero elements form an abelian group under ×. Examples of fields include R, C, and F_p (the integers mod p, for a prime p).

Extending this definition, F^n is the space of column vectors

X = (x_1, …, x_n)^T,  x_i ∈ F.


However, it is important to talk about abstract vector spaces, and we will do so:

Definition 78
An F-vector space V is a set with two operations, addition of two vectors V × V → V and scalar multiplication F × V → V, satisfying the following axioms:

* (V, +) is an abelian group with identity 0.
* Associativity: a(bv) = (ab)v for all a, b ∈ F and v ∈ V.
* Identity: 1v = v for all v ∈ V.
* Distributivity: if a, b ∈ F and v, w ∈ V, then (a + b)v = av + bv and a(v + w) = av + aw.


Using these laws, any computation will result in a linear combination of the form

a_1v_1 + a_2v_2 + ⋯ + a_nv_n,

where a_i ∈ F, v_i ∈ V. But the important idea is to try working with a finite ordered set of vectors, which we will denote (v_1, v_2, …, v_n). (Using braces {} from here on out means that we only care about the set elements and not their order.)

Then any linear combination can be written in the matrix multiplication form

[v_1 v_2 ⋯ v_n] (a_1, …, a_n)^T = v_1a_1 + v_2a_2 + ⋯ + v_na_n.

The scalar is on the wrong side, so let's just define va = av for a scalar a. This is consistent only because multiplication in F is commutative: a(bv) = a(vb) = (vb)a = v(ba), while (ab)v = v(ab), so we need ab = ba.
Fix S = (v_1, …, v_n). Then we can "multiply" S by a column vector X to get SX = v_1x_1 + v_2x_2 + ⋯ + v_nx_n. So using S, we now have a map

ψ: F^n → V

that sends X ↦ SX in a way such that ψ(X + Y) = ψ(X) + ψ(Y) and ψ(cX) = cψ(X). This means that ψ is a linear transformation, and we can therefore talk about the kernel and image of ψ:


Definition 79
These properties are stated for ψ but are really properties of the ordered set S:

* ψ is injective if the kernel is just the zero vector. In other words, ψ(X) = 0 ⟹ X = 0, or alternatively X ≠ 0 ⟹ SX ≠ 0_V. Then (v_1, v_2, …, v_n) is called linearly independent.
* ψ is surjective if the image is all of V. In other words, for all v ∈ V, there exists X ∈ F^n such that v = SX: every vector can be written as a linear combination of S, so S spans V.
* ψ is bijective if it is both injective and surjective; in that case (v_1, v_2, …, v_n) forms a basis for V.

Remark 80. The idea is that V is not exactly F^n; the two are just isomorphic under the map ψ. We don't need to write a vector space explicitly in terms of basis vectors.

(v_1, v_2, …, v_n) forming a basis means that for all v ∈ V, there exists exactly one X ∈ F^n such that v = v_1x_1 + v_2x_2 + ⋯ + v_nx_n. Then we can call X the coordinate vector of v with respect to the basis (v_1, v_2, …, v_n).




This means that we can always work with F^n instead of V if we have a basis, but it's generally hard to find the formula X = ψ^{-1}(v). Also, the basis is not unique; in fact, there are many bases for a vector space V, and there is often no natural basis.


Example 81 


Consider an m × n matrix A with entries in F. Let V be the space of solutions to Ax = 0, where x ∈ F^n. Then V is the nullspace (or kernel) of A. It has many bases, and there are many ways to pick one (for example, by row reduction).

Example 82
Let V be the space of functions f of t such that d^2f/dt^2 + f = 0. Then a natural basis is B = (cos t, sin t), but there are other natural bases too, such as B' = (e^{it}, e^{−it}).


Let's see how we can write vectors as combinations of other vectors with a matrix. Let S = (v_1, v_2, …, v_m) be a basis of V, and let A be an m × n matrix with entries in F. Then SA = (SA_1, SA_2, …, SA_n), where the A_j are the column vectors of A.

Now let (w_1, …, w_n) be any ordered set of vectors in V. Then any w_j can be written as a linear combination of the basis vectors, which we can write as

w_j = S (a_{1j}, …, a_{mj})^T = v_1a_{1j} + ⋯ + v_ma_{mj}

for some a_{ij} ∈ F. Thus there is a unique matrix A such that

[v_1 ⋯ v_m] A = [w_1 ⋯ w_n].


Theorem 83 


If B and B' are two bases for V with m and n elements respectively, then m = n. Call this number the dimension of V.

Proof. We show that if (v_1, …, v_m) spans V and (w_1, …, w_n) is a set of vectors in V, then m < n implies that (w_1, …, w_n) is dependent. This would imply that for two bases, m ≥ n, and similarly n ≥ m, so m = n. From above, we know that we can find a matrix A such that

[v_1 ⋯ v_m] A = [w_1 ⋯ w_n],

which we can write as PA = Q. Now since m < n, the system of equations AX = 0 (with m equations and n unknowns) has a nonzero solution X ∈ F^n. Multiply both sides of PA = Q on the right by such a solution X; then QX = PAX = 0, so the column vectors of Q are not independent, which is what we wanted to show.


One final important idea is that of a basechange matrix. Basically, we can always write a new basis of our vector space in terms of the old basis vectors as B' = BP, where B' and B are the ordered bases (written as rows of vectors) and P is a matrix. Then P is a basechange matrix, and it is invertible, since we could have reversed the process and written our old basis in terms of the new basis vectors.

Well, let v be any vector in our vector space. Then we can express it in both bases B and B' as v = BX = B'X', so BX = B'X' = BPX' ⟹ PX' = X. Thus, if B' = BP defines the basechange matrix, then PX' = X, and the coordinate vectors transform in the opposite direction from the bases.
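
To see the rule PX' = X in action, here is a small sketch in Python for V = R^2. The specific bases, the matrix P, and the helper names mat_mul and mat_vec are made-up values chosen only for illustration; each basis is stored as a matrix whose columns are the basis vectors.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, X):
    """Product of a matrix with a column vector given as a list."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

# Old basis B of R^2: columns v_1 = (1, 0) and v_2 = (1, 1).
B = [[1, 1],
     [0, 1]]
# Basechange matrix P (invertible, since det P = 1).
P = [[2, 1],
     [1, 1]]
B_new = mat_mul(B, P)          # B' = BP

X_new = [3, -1]                # coordinates X' of a vector v in the basis B'
v = mat_vec(B_new, X_new)      # v = B'X'
X_old = mat_vec(P, X_new)      # claimed coordinates X = PX' in the basis B
```

Here the basis is multiplied by P on the right to move from old to new, while the coordinates are multiplied by P on the left to move from new back to old.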
Definition 84
Let V, W be two vector spaces over a field F. A linear transformation is a map T: V → W such that

* T(v_1 + v_2) = T(v_1) + T(v_2) for any v_1, v_2 ∈ V; that is, T is an additive homomorphism.
* T(cv) = cT(v) for all c ∈ F and v ∈ V.


We can immediately come up with some (simple but useful) linear transformations: 


Example 85
Let A be an m × n matrix with entries in F. Then the map F^n → F^m taking a vector X ↦ AX is a linear transformation.

Example 86
Let S = (v_1, v_2, …, v_m) be an ordered set of vectors in V. The map ψ: F^m → V that sends X ↦ v_1x_1 + ⋯ + v_mx_m is a linear transformation.

Example 87
Let V be the space of functions of t, and let a ∈ R. Then the map V → R that sends f ↦ f(a) is a linear transformation.


We can represent any linear transformation T as a matrix with respect to a basis. Let B = (v_1, …, v_m) be a basis for V and C = (w_1, …, w_n) be a basis for W. Apply T to each v_j. Then T(v_j) is some vector in W, so it can be written as a linear combination of the basis vectors of W: T(v_j) = CA_j, where A_j is a column vector. Then the n × m matrix

A = [A_1 A_2 ⋯ A_m]

describes the linear transformation, and we can say that T(B) = CA as symbolic notation.

Well, we have a bijective map from F^m to V and a bijective map from F^n to W. So essentially, the matrix from F^m to F^n, denoted by A, is basically the same as the linear transformation T: V → W. We'll draw a diagram to explain why T(B) = CA. Here's the general form:

F^m --A--> F^n
 |B         |C
 v          v
 V  --T-->  W

and here's the form with a specific vector X:

 X  ------->  AX
 |            |
 v            v
 BX ------> T(BX) = CAX
Definition 88
The kernel or nullspace of a linear transformation T: V → W is the set of vectors v ∈ V such that T(v) = 0. The image of T is a subspace of W, defined to be the set of all w ∈ W such that there exists some v ∈ V with T(v) = w.

Basically, we're using the same definitions as before.

There are alternative ways to describe these spaces: in the language of F^m and F^n, describing our linear transformation in matrix form, the kernel is all X ∈ F^m such that AX = 0. In other words, this is the set of solutions to a homogeneous linear system. Meanwhile, the image is all Y ∈ F^n such that we can solve AX = Y; this is also the span of the column vectors of A.


Theorem 89 


Let T: V → W be a linear transformation. Then

dim V = dim ker T + dim Im T.

Proof. Let (u_1, …, u_k) be a basis for the kernel, and let (w_1, …, w_r) be a basis for the image. For each i, let v_i ∈ V be a vector such that T(v_i) = w_i (this exists because the w_i are in the image). We will show that (u_1, …, u_k, v_1, …, v_r) forms a basis for V, which implies the result.

First of all, we show that this is a linearly independent set. Say that we have

a_1u_1 + ⋯ + a_ku_k + b_1v_1 + ⋯ + b_rv_r = 0.

Apply T to both sides; then the first k terms go to zero, and we are left with

b_1w_1 + ⋯ + b_rw_r = 0,

and since the w_i form a basis for the image, all b_i are equal to 0. Now going back to the original relation, the a_i must all be zero since the u_i form a basis for the kernel.

Now we show this set spans V. Pick some arbitrary vector v ∈ V, and let w = T(v). We can write w as a linear combination of the vectors in (w_1, …, w_r) because it is in the image:

T(v) = w = b_1w_1 + ⋯ + b_rw_r.

Now take the corresponding vectors in V: let v' = b_1v_1 + b_2v_2 + ⋯ + b_rv_r, so that T(v') = T(v). Thus T(v − v') = 0, so v − v' is in the kernel, so we can write

v − v' = a_1u_1 + ⋯ + a_ku_k ⟹ v = a_1u_1 + ⋯ + a_ku_k + b_1v_1 + ⋯ + b_rv_r,

and now we have found a way to write v as a linear combination of the set we wanted, so the set spans V.

Since (u_1, …, u_k, v_1, …, v_r) is linearly independent and spanning, it forms a basis.

Notice that this is really similar to the counting formula |G| = |ker φ| |Im φ|. In particular, it's the same formula in some cases! Consider F = F_p, the integers mod p, and consider some linear transformation φ of F-vector spaces (which is in particular a homomorphism of additive groups). Then if dim V = n and dim ker φ = k, we have dim Im φ = n − k, and counting elements gives p^n = p^k p^{n−k}.
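
The agreement between the dimension formula and the counting formula over F_p can be checked by enumerating every vector. The sketch below is only an illustration under our own assumptions: the prime p = 3, the 2 × 3 matrix A, and the helper names apply and kernel_and_image are hypothetical choices, not from the lecture.

```python
from itertools import product

def apply(A, X, p):
    """Compute AX with all arithmetic done mod p."""
    return tuple(sum(a * x for a, x in zip(row, X)) % p for row in A)

def kernel_and_image(A, n, p):
    """Enumerate F_p^n and collect the kernel and the image of X -> AX."""
    kernel, image = [], set()
    for X in product(range(p), repeat=n):
        Y = apply(A, X, p)
        image.add(Y)
        if not any(Y):          # Y is the zero vector
            kernel.append(X)
    return kernel, image

p, n = 3, 3
A = [[1, 2, 0],
     [2, 1, 0]]                 # second row is 2 * (first row) mod 3, so the rank is 1
kernel, image = kernel_and_image(A, n, p)
```

In this example the kernel has dimension 2 and the image has dimension 1, so the counts are 3^2 and 3^1 and their product is 3^3, the size of F_3^3.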

We'll shift gears now and consider a change of basis. Let B and B', C and C', and A and A' be old and new bases of V, bases of W, and matrices representing T, respectively. Then recall that we can write B' = BP, C' = CQ for invertible matrices P, Q. Since A is defined by T(B) = CA, substituting in B = B'P^{-1}, C = C'Q^{-1}, we are left with

T(B')P^{-1} = T(B'P^{-1}) = C'Q^{-1}A,

so we know that T(B') = C'(Q^{-1}AP). But we also have that T(B') = C'A', so we actually have the following result:

Fact 90
Let A represent a linear transformation from V to W. Then A' = Q^{-1}AP is the matrix of the same transformation after a change of basis, where P and Q are the basechange matrices for V and W, respectively.


Recall that the elementary matrices generate the general linear group: this means that we can do arbitrary row operations on A, as well as arbitrary column operations (doing either the row or the column operations first). This means that we can make A' very simple. In fact, we can make it almost an identity matrix with some zeros on the outside. We'll come back to this later.


10 September 28, 2018 


Consider some linear transformation T: V → V with matrix A, using the same basis B for the domain and the codomain. (Such a T is also known as a linear operator.) The matrix A has the property that if X is the coordinate vector of v ∈ V with respect to the basis B, then Y = AX is the coordinate vector of T(v).


Example 91 


Rotate the plane by an angle θ. Then the corresponding matrix is of the form

[ cos θ  −sin θ ]
[ sin θ   cos θ ].

Notice that the first column is the final location of the first basis vector (1, 0)^T and the second column is the final location of the second basis vector (0, 1)^T.

Example 92
Let P be the space of polynomials in t with degree at most 3. Then the map of differentiation d/dt: P → P is a linear operator. Taking our basis to be (1, t, t^2, t^3), the matrix is of the form

[ 0 1 0 0 ]
[ 0 0 2 0 ]
[ 0 0 0 3 ]
[ 0 0 0 0 ].
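
As a quick check of Example 92, the matrix of differentiation can be built column by column from d/dt (t^j) = j t^{j−1} and applied to a coordinate vector. This is a minimal sketch; the function names differentiation_matrix and mat_vec and the sample polynomial are our own illustrative choices.

```python
def differentiation_matrix(n):
    """Matrix of d/dt on polynomials of degree <= n in the basis (1, t, ..., t^n)."""
    size = n + 1
    D = [[0] * size for _ in range(size)]
    for j in range(1, size):
        D[j - 1][j] = j        # column j holds the coordinates of j * t^(j-1)
    return D

def mat_vec(A, X):
    return [sum(a * x for a, x in zip(row, X)) for row in A]

D = differentiation_matrix(3)
coords = [5, -1, 4, 2]         # the polynomial 5 - t + 4t^2 + 2t^3
deriv = mat_vec(D, coords)     # coordinates of its derivative -1 + 8t + 6t^2
```

Each column of D is the coordinate vector of the derivative of the corresponding basis polynomial, which is exactly how the matrix of a linear operator is defined.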


Remember that in general, we can change our basis to be more convenient. If we change the basis used for both the domain and the codomain from B to B' = BP, then our new matrix will be of the form

A' = P^{-1}AP (conjugation by P^{-1}).

In other words, we have to do a row operation, as well as the corresponding column operation, to get to our final matrix.


Example 93
Let A be the matrix

[ a b ]
[ c d ],

and let E be the elementary matrix

[ 1 x ]
[ 0 1 ].

Then E^{-1}AE will be

[ 1 −x ] [ a b ] [ 1 x ]   [ a − xc   xa − x^2c + b − xd ]
[ 0  1 ] [ c d ] [ 0 1 ] = [   c           cx + d        ].

This is not super clean. However, we can pick x appropriately so that a − xc or cx + d is zero, as long as c ≠ 0. Choosing a − xc = 0, we can write the matrix as

[ 0 b ]
[ c d ],

and then conjugating by the diagonal matrix

[ 1 0 ]
[ 0 c ],

we end up with a matrix of the form

[ 0 b ]
[ 1 d ]

for new b, d.


Fact 94 


This is called the rational canonical form. This is the best we can do: we can fix where one of our basis vectors goes, but not both.
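
The two conjugations of Example 93 can be carried out exactly with rational arithmetic. The following is a sketch under the assumption c ≠ 0; the function names (mat_mul, conjugate, rational_canonical_2x2) and the sample matrix are hypothetical and only meant to illustrate the steps.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conjugate(A, P, P_inv):
    """Return P^{-1} A P."""
    return mat_mul(P_inv, mat_mul(A, P))

def rational_canonical_2x2(A):
    """Conjugate [[a, b], [c, d]] with c != 0 into the form [[0, b'], [1, d']]."""
    A = [[Fraction(entry) for entry in row] for row in A]
    (a, _), (c, _) = A
    if c == 0:
        raise ValueError("this sketch assumes the lower-left entry is nonzero")
    x = a / c                                  # chosen so that a - x*c = 0
    E, E_inv = [[1, x], [0, 1]], [[1, -x], [0, 1]]
    A1 = conjugate(A, E, E_inv)                # now of the form [[0, b], [c, d]]
    c1 = A1[1][0]
    D, D_inv = [[1, 0], [0, c1]], [[1, 0], [0, 1 / c1]]
    return conjugate(A1, D, D_inv)             # now of the form [[0, b'], [1, d']]

A = [[1, 2],
     [3, 4]]
A_rcf = rational_canonical_2x2(A)
```

For this sample matrix the result is [[0, 2], [1, 5]]: the lower-right entry is the trace 1 + 4, and the negative of the upper-right entry is the determinant 1·4 − 2·3.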


Fixing the first column is actually related to the concept of an eigenvector: 


Definition 95 
Let T be a linear operator on V. An eigenvector of T is a nonzero vector v ∈ V such that Tv = λv for some λ ∈ R. λ is the corresponding eigenvalue of v.

In particular, if we pick (v_1, …, v_n) to be a basis of eigenvectors, then the matrix A is diagonal with entries equal to the eigenvalues. (This is really nice!)


Example 96
Let V = R^2, and let T be the transformation represented as multiplication by a 2 × 2 matrix A whose entries are all positive. This sends the standard basis vectors (1, 0)^T and (0, 1)^T to the columns of A, which both lie in the first quadrant. Thus, the first quadrant gets sent to a narrower slice of the first quadrant. Apply the transformation again (that is, apply A^2), and we have an even narrower slice. Keep doing this, and we get a line: that's a line of eigenvectors! We can state this as a more general result:


Theorem 97 


A real matrix with positive entries has a ray of eigenvectors with all positive entries. 


The “proof” is basically the same for more dimensions: we apply our transformation repeatedly. 
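
The "apply the transformation repeatedly" idea can be tried numerically. The sketch below is an illustration only: the positive matrix A = [[2, 1], [1, 2]] is our own example (not necessarily the matrix used in lecture), and the helper names are hypothetical. Rescaling after each step keeps the vector at length 1 without changing its direction.

```python
import math

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def power_iterate(A, v, steps):
    """Apply A repeatedly, rescaling to unit length after each application."""
    for _ in range(steps):
        v = normalize(mat_vec(A, v))
    return v

A = [[2.0, 1.0],
     [1.0, 2.0]]                        # a matrix with all positive entries
v = power_iterate(A, [1.0, 0.0], 60)    # start from the first standard basis vector
Av = mat_vec(A, v)
eigenvalue_estimate = Av[0] / v[0]
```

For this matrix the iterates settle on the direction (1, 1), which has positive entries and eigenvalue 3.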
Example 98
The eigenvectors of the differentiation map are the nonzero constant polynomials, with coordinate vectors of the form (c, 0, 0, 0, ⋯) and eigenvalue 0. (Yes, 0 is an allowed eigenvalue; only the eigenvector is required to be nonzero.)


In general, how do we find eigenvectors? First, find the eigenvalues! Assume x is a nonzero eigenvector with eigenvalue λ: then

Ax = λx ⟹ (A − λI)x = 0 ⟹ det(A − λI) = 0

(A − λI must be singular to have a nonzero solution, so its determinant is 0).


Example 99
The eigenvalues of a 2 by 2 matrix

[ a b ]
[ c d ]

must satisfy (a − λ)(d − λ) − bc = 0 ⟹ λ^2 − (a + d)λ + (ad − bc) = 0. So there are at most two eigenvalues for a 2 by 2 matrix.

a + d is the trace of the matrix; it is the sum of the diagonal entries but also the sum of the eigenvalues (counted with multiplicity).
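
The quadratic in Example 99 can be solved directly from the trace and determinant. Here is a short sketch; the function name eigenvalues_2x2 and the two sample matrices are our own choices, and cmath is used so that complex roots are handled as well.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of λ^2 - (a + d)λ + (ad - bc) for the matrix [[a, b], [c, d]]."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

# A symmetric example with two real eigenvalues.
real_pair = eigenvalues_2x2(2, 1, 1, 2)
# Rotation by 90 degrees: no real eigenvalues, only the complex pair ±i.
rot_pair = eigenvalues_2x2(0, -1, 1, 0)
```

In both cases the two roots add up to the trace and multiply to the determinant.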


Example 100
Let R be the rotation matrix

[ cos θ  −sin θ ]
[ sin θ   cos θ ].

Then the characteristic polynomial is λ^2 − (2 cos θ)λ + 1, which has no real roots unless cos θ = ±1. Over C, we get eigenvalues cos θ ± √(cos^2 θ − 1) = cos θ ± i sin θ = e^{±iθ}, with corresponding eigenvectors (1, ∓i)^T.


In general, how would we find the determinant of tI − A (flipping the sign so the characteristic polynomial is nicer)? We'll call this polynomial p(t). The leading term is t^n (from the product of the diagonal entries), and the t^{n−1} coefficient is −Σ a_{ii}, which is the negative of the trace of the matrix. (All the coefficients in the middle are a huge mess; by Vieta's formulas they are symmetric functions of the eigenvalues.) Finally, the constant term is easy: it's just p(0) = det(−A) = (−1)^n det A, since the ts cannot contribute. Thus, the characteristic polynomial p(t) of an n × n matrix is a degree n polynomial in t with some additional nice properties. In particular, consider the case F = C. For a typical matrix, the characteristic polynomial p(t) has n distinct roots, and this gives us a nice result:


Lemma 101 


Let v_1, …, v_k be eigenvectors with eigenvalues λ_1, …, λ_k. If all λ_i are distinct, then (v_1, …, v_k) are independent.

Thus, we can form a basis with our eigenvectors (when there are n distinct eigenvalues).

Proof. Suppose a_1v_1 + a_2v_2 + ⋯ + a_kv_k = 0. We wish to show that a_i = 0 for all 1 ≤ i ≤ k. We induct on k; this is clearly true for k = 0, 1.

For the inductive step, assume this is true for all values less than k. Thus, v_1, …, v_{k−1} are independent. Apply the linear transformation T; now T(0) = 0, so

Σ a_iT(v_i) = Σ λ_ia_iv_i = 0.

Now multiplying the first equation by λ_k, we get Σ λ_ka_iv_i = 0. Subtracting these, the λ_ka_kv_k terms cancel, so we now have Σ a_i(λ_i − λ_k)v_i = 0, where the sum is over 1 ≤ i ≤ k − 1; λ_i − λ_k can't be zero, since all eigenvalues are distinct. Since v_1, …, v_{k−1} are independent by the inductive assumption, a_i = 0 for all 1 ≤ i ≤ k − 1, and therefore a_kv_k = 0 ⟹ a_k = 0 as well, showing that the v_i are independent.


11 October 1, 2018 


Let T: V → V be a linear operator, and let A be its matrix with respect to some basis. Recall that the characteristic polynomial p(t) = det(tI − A) is a degree n polynomial in t, such that the coefficients of t^n, t^{n−1}, 1 are 1, −tr A, (−1)^n det A, respectively.

Notice that the polynomial is independent of the basis: det(tI − P^{-1}AP) = det(P^{-1}(tI − A)P) = det(tI − A). In particular, the roots are always the same (they are the eigenvalues of the linear transformation).

Corollary 102
Under a change of basis (so A' = P^{-1}AP), the traces of the matrices A and P^{-1}AP are always the same. This also means that the traces of AB and BA are the same when B is invertible: replace A by BA and set P = B, so that P^{-1}(BA)P = AB.
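
Corollary 102 is easy to test on small integer matrices. In this sketch the matrices A, B, P and the helper names mat_mul and trace are arbitrary illustrative choices; P was picked with determinant 1 so that its inverse also has integer entries.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [5, -2]]
P = [[2, 1],
     [1, 1]]
P_inv = [[1, -1],
         [-1, 2]]                               # inverse of P, since det P = 1

A_conj = mat_mul(P_inv, mat_mul(A, P))          # A' = P^{-1} A P
trace_AB = trace(mat_mul(A, B))
trace_BA = trace(mat_mul(B, A))
```

The conjugated matrix A' looks quite different from A, but its trace is unchanged, and the products AB and BA have the same trace even though they are different matrices.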


Example 103
Recall that we can always get a matrix

A = [ a b ]
    [ c d ]

(with c ≠ 0) into the form

A' = [ 0 b' ]
     [ 1 d' ].

We can now see that we are stuck (and can't simplify further) for sure: d' must be the trace of A, and −b' must be the determinant of A.


Let's assume that our vector space V is over the field F = C. If p(t) has all distinct roots λ_1, …, λ_n, and the v_i are their corresponding eigenvectors, then (v_1, …, v_n) forms a basis, and the matrix of T in that basis is diagonal with entries λ_i. Most polynomials have all distinct roots, so this is usually fine.

But suppose our characteristic polynomial has a double root; for example, let dim V = 2. Then the characteristic polynomial is (t − λ)^2, and we can pick an eigenvector v_2. If (v_1, v_2) is a basis for V, then in that new basis we have

A' = [ λ 0 ]
     [ c λ ]

for some constant c, and we can't always diagonalize our matrix. In particular, if c ≠ 0, then v_2 and its multiples are the only eigenvectors, and we have a degenerate case, since

[ λ 0 ] [ x ]   [   λx    ]     [ x ]
[ c λ ] [ y ] = [ cx + λy ] = λ [ y ]

only if c = 0 or x = 0; if c ≠ 0 then all eigenvectors are of the form (0, y)^T, which in this basis means they are multiples of v_2.

So how do we deal with these degenerate cases? We need the notion of decomposing a matrix into blocks:

Definition 104
Let W_1 and W_2 be subspaces of V. V is called the direct sum of W_1 and W_2, denoted V = W_1 ⊕ W_2, if all v ∈ V can be written as v = w_1 + w_2 in a unique way (where w_1 ∈ W_1 and w_2 ∈ W_2).

It is equivalent to say that W_1 ∩ W_2 = {0} and that W_1 + W_2 = V.

Definition 105
Let W ⊆ V be a subspace. Then W is T-invariant if TW ⊆ W.

Suppose V = W_1 ⊕ W_2, and W_1, W_2 are T-invariant. Then let (v_1, …, v_c) be a basis for W_1 and (v_{c+1}, …, v_n) be a basis for W_2. Then we can write the matrix of T with respect to this new basis as

[ A 0 ]
[ 0 D ],

where A and D are square matrices. This is because v_1, …, v_c all go to some linear combination of the first c basis vectors, and similarly for v_{c+1}, …, v_n.


Definition 106
A Jordan block is any of the following matrices for a fixed λ:

[ λ ],   [ λ 0 ]   [ λ 0 0 ]
         [ 1 λ ],  [ 1 λ 0 ],  ⋯
                   [ 0 1 λ ]

Theorem 107 (Jordan normal form)
Let T: V → V be a linear operator on a finite-dimensional complex vector space. Then there exists a basis such that the matrix of T is made up of Jordan block matrices along the diagonal:

[ J_1             ]
[     J_2         ]
[         ⋱       ]
[             J_k ],

where the λ_i for each J_i may be equal to each other or not.

This has a lot of zeros, and in some ways it's the "simplest" way we can represent our matrix.


Proof. Choose an eigenvalue λ of T. Replace T by T − λI; now we can assume T has eigenvalue 0, so that T is not invertible. We will construct a Jordan block for λ.

Let N_1 = ker T; this is not the zero space (for the reason above). Let N_2 = ker T^2, and so on. Anything "killed" (sent to zero) by T is certainly killed by T^2, and anything killed by T^2 is certainly killed by T^3, and so on, so we have a chain of subspaces {0} ⊊ N_1 ⊆ N_2 ⊆ ⋯. Similarly, let W_1 = Im T, W_2 = Im T^2, ⋯. For a similar reason, if something can be written as T(T(v)), it can be written as T(v') with v' = T(v). Thus, we also have the chain of subspaces V ⊋ W_1 ⊇ W_2 ⊇ ⋯.

These are infinite chains, but we have a finite-dimensional vector space. Thus, there exists some k such that N_k = N_{k+1} = ⋯ = N and W_k = W_{k+1} = ⋯ = W. By the dimension formula, dim N_i + dim W_i = dim V for every i, so it is also true that dim V = dim N + dim W. Now we need to prove a quick result:


Lemma 108 


N and W are T-invariant, and V = N ⊕ W.

Proof of lemma. N and W are T-invariant because we have reached the constant part of each sequence. In particular, everything in the kernel of T^k is sent by T to something in the kernel of T^k (if T^k x = 0, then T^k(Tx) = T(T^k x) = 0), and the same idea is true for the image. By the dimension formula, if we can show that the intersection of N and W is trivial, we will have shown that V = N ⊕ W, since a basis for N together with a basis for W then gives a basis for V.

Assume there is some x ∈ N ∩ W. Then x = T^k v for some v, but also T^k x = 0. So T^k(T^k v) = 0 ⟹ T^{2k} v = 0, so v ∈ N_{2k}. Thus v ∈ N_k as well, so T^k v = 0, so x = 0, completing the proof.


So by the work we did above, we can choose a basis for V such that the first part is a basis for N and the second part is a basis for W. By block decomposition, we can now write our matrix as

[ A 0 ]
[ 0 D ].

Since N is not trivial, A has size at least 1, so D is strictly smaller than the original matrix and we can induct on D: repeat the process again with a new eigenvalue. So all we need to do is show that A is made up of Jordan blocks.

Since A is the matrix for T on the space N, there exists some k such that T^k = 0 on N; this means T is nilpotent on N. To be continued...

Today, we'll take the field F = R.


Definition 109 


Define the dot product of two column vectors as x · y = Σ x_iy_i = x^T y.

This has the following important properties:

* It is linear in x and y.
* It is symmetric.
* x · x = x_1^2 + ⋯ + x_n^2 = |x|^2.
* x · y = |x||y| cos θ, where θ is the angle between the vectors x and y. This is true in R^n, but we have to be more careful with defining an "angle."

The proof of this last statement is the law of cosines: make a triangle with sides given by the vectors x, y, x − y. Then

|x − y|^2 = |x|^2 + |y|^2 − 2|x||y| cos θ  and  |x − y|^2 = (x − y) · (x − y) = |x|^2 + |y|^2 − 2x · y,

and the result follows.


Definition 110 


Let (v_1, …, v_n) be a basis of R^n. Such a basis is orthogonal if v_i · v_j = 0 for all i ≠ j, and it is orthonormal if v_i · v_j = δ_{ij} (that is, the basis is orthogonal and every v_i has length 1).

Notice that the standard basis is orthonormal (all vectors are orthogonal and all have length 1).

Lemma 111
If v_1, …, v_k are orthogonal and nonzero, then they are independent.

Proof. Assume we have a linear combination c_1v_1 + ⋯ + c_kv_k = 0. Then for all i,

0 = v_i · (c_1v_1 + ⋯ + c_kv_k) = v_i · (c_iv_i) = c_i|v_i|^2,

so c_i = 0 because v_i ≠ 0. Thus the only linear combination equal to zero is the trivial one, and thus the v_i are independent.


Definition 112 


A linear operator T: R^n → R^n is orthogonal if T(x) · T(y) = x · y for all x, y.

Let A be the corresponding matrix for T. Then we can rewrite this as

(Ax) · (Ay) = x · y ⟹ (Ax)^T(Ay) = x^Ty ⟹ x^TA^TAy = x^Ty,

so A^TA = I is a sufficient condition for A to be orthogonal, and it turns out to be necessary as well:

Lemma 113
x^TMy = x^Ty for all x, y if and only if M = I.

Proof. It's pretty clear that the equation is true if M = I. For the other direction, let x = e_i, y = e_j (the vectors with 1 in that entry and 0 everywhere else). Then x^TMy is the entry M_{ij}, but it is also equal to e_i^Te_j, which is the Kronecker delta (1 if i = j and 0 otherwise). Thus M must be the identity.

Definition 114
A square matrix A is orthogonal if A^TA = I, or equivalently if A^T = A^{-1}.

This is equivalent to having an orthogonal operator, and it is also equivalent to having the columns of A form an orthonormal basis.

It turns out that the orthogonal matrices form a group O_n ⊆ GL_n! This is pretty easy to check: if A^TA = I and B^TB = I, then (AB)^TAB = B^TA^TAB = B^TB = I, so A, B ∈ O_n ⟹ AB ∈ O_n. I is clearly in the group, and taking the transpose of both sides of A^T = A^{-1} gives A = (A^{-1})^T, so (A^{-1})^TA^{-1} = AA^{-1} = I and A^{-1} ∈ O_n.


Fact 115 


Notably, det A^T = det A, so (det A)^2 = det(A^TA) = 1 and the determinant of an orthogonal matrix is ±1.


Definition 116 


Let SO_n be the special orthogonal group, consisting of the elements of O_n with determinant 1. (SO_n has two cosets in O_n.)


Let's describe these orthogonal matrices in dimension 2 and 3: we want all the column vectors to have length 1 and be orthogonal. In dimension 2, any vector of length one can be written as (cos θ, sin θ)^T, so our matrix must be written in the form

A = [ cos θ  b ]
    [ sin θ  d ].

Notice that rotation matrices work: the columns of

R = [ cos θ  −sin θ ]
    [ sin θ   cos θ ]

are orthonormal. Then

R^T = R^{-1} = [  cos θ  sin θ ]
               [ −sin θ  cos θ ],

and by closure, R^TA is an orthogonal matrix as well. But the first column of R^TA is (1, 0)^T, so the second column must be (0, ±1)^T. Plugging this back in, we find that A must be of one of the two forms

[ c −s ]      [ c  s ]
[ s  c ]  or  [ s −c ],

where c = cos θ, s = sin θ for some θ. Thus O_2 is the set of rotation / reflection matrices, and SO_2 is the set of rotation matrices. Let R be SO_2 and S be O_2 \ SO_2; these are the two cosets of SO_2.


Fact 117 


Consider the characteristic polynomial of a matrix M ∈ S. By direct computation, the polynomial is always λ^2 − 1, so there is an eigenvector x with eigenvalue 1 and an eigenvector y with eigenvalue −1. Then Mx = x, My = −y, so

x · y = Mx · My = x · (−y) = −(x · y).

So x · y = 0, meaning x and y are orthogonal, and therefore we can describe any matrix in S as fixing a line and reflecting everything else over it.

It turns out that the angle α of the fixed line of eigenvectors from the x-axis is α = θ/2. This is because we can get from a matrix in S to a matrix in R (a rotation matrix) by changing the sign of the second column. This is a multiplication on the right by

[ 1  0 ]
[ 0 −1 ],

which flips over the x-axis. So a vector p on the line of reflection at angle α is sent to angle −α. But then rotating by the matrix in R gets us back to p. Thus α − (−α) = θ, and α = θ/2.
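
The claim that a matrix in S fixes the line at angle θ/2 can be checked numerically. This is a minimal sketch; the angle θ = 1.1 and the helper names reflection_matrix and mat_vec are arbitrary choices made for illustration.

```python
import math

def reflection_matrix(theta):
    """The matrix [[c, s], [s, -c]] with c = cos(theta), s = sin(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

theta = 1.1
M = reflection_matrix(theta)
alpha = theta / 2

on_line = [math.cos(alpha), math.sin(alpha)]     # unit vector along the claimed fixed line
normal = [-math.sin(alpha), math.cos(alpha)]     # unit vector perpendicular to that line

fixed = mat_vec(M, on_line)      # should equal on_line  (eigenvalue 1)
flipped = mat_vec(M, normal)     # should equal -normal  (eigenvalue -1)
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]
```

The two eigenvectors found this way are perpendicular, matching Fact 117, and the determinant is −1, so M lies in the coset S rather than in SO_2.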


Theorem 118 


Any matrix A ∈ SO_3 is a rotation of R^3 (where we fix a line as an axis of rotation and rotate the orthogonal plane by some angle).

This is not true in R^4, since we can take two orthogonal planes and rotate them by different angles.

Proof. First, we will find a fixed vector, that is, an eigenvector with eigenvalue 1. We do this by showing det(A − I) = 0. Since det A^T = 1 and A^TA = I, we know that det(A − I) = det(A^T(A − I)) = det(I − A^T). So det(A − I) = det((I − A)^T) = det(I − A). But taking the negative of a 3 by 3 matrix multiplies the determinant by (−1)^3 = −1, so det(I − A) = −det(A − I). Thus det(A − I) = 0, so λ = 1 is an eigenvalue.
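
As a numerical sanity check of det(A − I) = 0, we can build an element of SO_3 as a product of two coordinate-axis rotations. The angles and the helper names (rot_x, rot_z, mat_mul, det3) in this sketch are our own illustrative assumptions.

```python
import math

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = mat_mul(rot_z(0.7), rot_x(1.9))      # a product of rotations, so A is in SO_3
A_minus_I = [[A[i][j] - (1 if i == j else 0) for j in range(3)] for i in range(3)]

det_A = det3(A)
det_A_minus_I = det3(A_minus_I)
```

Neither factor fixes the axis of the other, so the fixed line of A is not obvious by inspection, yet det(A − I) still vanishes up to rounding error.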


The rest of the proof uses this eigenvector of eigenvalue 1 as one of our basis vectors in an orthonormal basis (to be continued).

Definition 119
Let F be a figure in the plane. A symmetry of F is a bijective, distance-preserving map from F to F.


Example 120 
Consider the symmetries of a regular pentagon. There are 5 rotations, including the identity, and we can reflect over any one of the 5 lines of bilateral symmetry. This is a total of 10 symmetries.


Fact 121 


The symmetries of a figure F always form a group. 


This is not hard to prove: the identity is a symmetry, we can invert a symmetry, and we can compose symmetries. 


Often, we can think of extending the symmetry to the whole plane. 


Definition 122 


An isometry of R^n (mostly n = 2) is a map f: R^n → R^n that is distance-preserving: we have

d(f(x), f(y)) = d(x, y) for all x, y ∈ R^n,

where d(x, y) = |x − y|.


Isometries are bijective maps, though this may not be obvious. 


Example 123 


Any orthogonal linear operator φ is an isometry.

This is because we know that φ(x) · φ(y) = x · y, so

|φ(x) − φ(y)|^2 = (φ(x) − φ(y)) · (φ(x) − φ(y)) = φ(x − y) · φ(x − y) = (x − y) · (x − y) = |x − y|^2,

and distances are indeed preserved.

Example 124
Translations t_a by a vector a, of the form t_a(x) = x + a, are also isometries.

(It is pretty obvious that these preserve distance: t_a(y) − t_a(x) = y − x.)


Theorem 125 


Let f be an isometry, and suppose f(0) = 0. Then f is an orthogonal linear operator.

Proof by Sharon Hollander. Preserving dot products is easy: since |f(x) − f(y)| = |x − y|, setting y = 0 gives

(f(x) − f(0)) · (f(x) − f(0)) = (x − 0) · (x − 0) ⟹ f(x) · f(x) = x · x.

So now, we can expand (f(x) − f(y)) · (f(x) − f(y)) = (x − y) · (x − y):

f(x) · f(x) − 2f(x) · f(y) + f(y) · f(y) = x · x − 2x · y + y · y,

and canceling, f(x) · f(y) = x · y.

Showing that the operator is compatible with + is a bit harder. We want to show f(x + y) = f(x) + f(y), which is the same as saying f(x + y) − f(x) − f(y) = 0, or equivalently that this vector has length 0. Thus, we want to show

(f(x + y) − f(x) − f(y)) · (f(x + y) − f(x) − f(y)) = 0.

Expanding this out, the left side is

f(x + y) · f(x + y) + f(x) · f(x) + f(y) · f(y) − 2f(x + y) · f(x) − 2f(x + y) · f(y) + 2f(x) · f(y).

Since we know f preserves dot products, we can basically drop all the fs: this becomes

(x + y) · (x + y) + x · x + y · y − 2(x + y) · x − 2(x + y) · y + 2x · y,

and factoring, this is

((x + y) − x − y) · ((x + y) − x − y) = 0,

which is true. So now follow all of the steps in reverse! The same argument applied to f(cx) − cf(x) shows that f is compatible with scalar multiplication.


Corollary 126 


All isometries are a composition of an orthogonal linear operator and (then) a translation. Specifically, f(0) = a ⟹ f = t_a ∘ φ for some orthogonal operator φ.

Proof. Define φ = t_{−a} ∘ f. Then φ is an isometry with φ(0) = 0, and Theorem 125 tells us that φ is an orthogonal operator.

In particular, since translations and orthogonal operators are both bijective, we've shown that all isometries are bijective.


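So now, let's look at the special case $\mathbb{R}^2$. We know that the orthogonal operators are either of the form
$$\rho_\theta = \begin{pmatrix} c & -s \\ s & c \end{pmatrix},$$
where $c = \cos\theta$, $s = \sin\theta$, or of the form $\rho_\theta r$, taking $r$ to be the standard reflection matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.

So all isometries of $\mathbb{R}^2$ are of the form $t_a \rho_\theta$ or $t_a \rho_\theta r$. There are some rules that help with composition:
* $t_a t_b = t_{a+b}$.
* $\rho_\alpha \rho_\beta = \rho_{\alpha + \beta}$.
* $r^2 = 1$, the identity isometry.
* We know that $\rho_\theta t_a(x) = \rho_\theta(x + a) = \rho_\theta(x) + \rho_\theta(a)$. Thus, $\rho_\theta t_a = t_{\rho_\theta(a)} \rho_\theta$.
* Similarly, $r t_a = t_{r(a)} r$.
* Finally, $r \rho_\theta = \rho_{-\theta} r$.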

It's more interesting to think of geometric descriptions of these isometries. If there is no $r$, then we can either have $t_a$, a translation, or $t_a \rho_\theta$ with $\theta \neq 0$, which turns out to be a rotation by the same angle $\theta$, just about a different point. These are the orientation-preserving isometries.

Then we have the orientation-reversing isometries: $\rho_\theta r$ is a reflection over the line at angle $\frac{1}{2}\theta$, and finally, $t_a \rho_\theta r$ is a glide reflection:


Definition 127


A glide reflection is defined by a glide line and a glide vector parallel to it. The glide reflection first reflects about the glide line and then translates by the glide vector.


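Let's try to write this out. What does it mean to rotate by an angle $\theta$ around some point $p$? We first apply a translation so $p$ is at the origin, then rotate, then translate back. Thus, we have
$$\rho_{\theta,p} = t_p \rho_\theta t_{-p} = t_p t_{\rho_\theta(-p)} \rho_\theta = t_p t_{-\rho_\theta(p)} \rho_\theta = t_{p - \rho_\theta(p)} \rho_\theta.$$

So to show that $t_a \rho_\theta$ is a rotation about some point for $\theta \neq 0$, we just need to show that $a = p - \rho_\theta(p)$ for some $p$.

Well, let $R$ be the matrix corresponding to $\rho_\theta$. Then
$$a = p - \rho_\theta(p) = (I - R)p,$$
and as long as $I - R$ has nonzero determinant, it is invertible and we can find the point $p$. Well, let's write it out explicitly:
$$\det \begin{pmatrix} 1 - c & s \\ -s & 1 - c \end{pmatrix} = 1 - 2c + c^2 + s^2 = 2 - 2c,$$
so this is not zero unless $\cos\theta = 1$, which means $\theta = 0$. But we assumed $\theta \neq 0$ (otherwise we just have a translation), so $I - R$ is invertible where relevant.

By the way, there is also a nice geometric construction for finding $p$ such that $p - \rho_\theta(p) = a$. Since the rotation sends $0$ to $a$, the point $p$ is equidistant from $0$ and $a$: place the vertex of an angle $\theta$ on the perpendicular bisector of the segment from $0$ to $a$ so that its two sides pass through $0$ and $a$.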
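The computation above can be tried numerically. The following is a minimal sketch, not part of the lecture: the function names `rotation_center` and `apply_isometry` are made up for illustration. It solves $(I - R)p = a$ with the explicit $2 \times 2$ inverse and checks that $p$ is fixed by $t_a \rho_\theta$.

```python
import math

def rotation_center(a, theta):
    """Solve (I - R) p = a, where R is the matrix of rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    m = 1 - c
    det = 2 - 2 * c  # determinant of I - R = [[1 - c, s], [-s, 1 - c]]
    if abs(det) < 1e-12:
        raise ValueError("theta is a multiple of 2*pi, so t_a rho_theta is a translation")
    ax, ay = a
    # Inverse of [[m, s], [-s, m]] is (1/det) * [[m, -s], [s, m]]
    return ((m * ax - s * ay) / det, (s * ax + m * ay) / det)

def apply_isometry(a, theta, x):
    """Apply t_a rho_theta to the point x: rotate about the origin, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1] + a[0], s * x[0] + c * x[1] + a[1])

# Example: a quarter turn followed by translation by (1, 0)
p = rotation_center((1.0, 0.0), math.pi / 2)
print(p, apply_isometry((1.0, 0.0), math.pi / 2, p))
```

For this example the center comes out as (1/2, 1/2), up to floating-point rounding, and applying the isometry to it returns the same point.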


14 October 12, 2018 


The quiz grades will be posted by tomorrow. Also, there is a student who has not taken the quiz yet, so the solutions 
will be posted after that. 


Here's a picture of a plane figure F (it is infinite in all directions):

[Figure not reproduced.]

There are no immediate lines of reflection for this shape, unlike the normal brick pattern, which has two lines of reflection. Let's think about the group of symmetries: what transformations take this back to itself?


Fact 128 


We can pick the center of any brick and rotate around its center by 180 degrees. Also, we can do a translation 


by integer combinations of two different vectors a and b: denote this as aZ + bZ. 


So the symmetries of our figure F are isometries $f$ of the plane P such that $f : F \to F$ carries the figure to itself. In other words, the symmetries of F form a subgroup G of the group M of all isometries of P.

Recall that we have elements of M of the form $m = t_a \rho_\theta$ or $m = t_a \rho_\theta r$, where $\rho_\theta$ is the rotation by $\theta$ around the origin, and $r$ is the standard reflection over the x-axis. In the first case, if $\theta = 0$, this is a translation, and otherwise, it's a rotation about some point $p \in P$. The second case consists of glide reflections: reflect over some line and then glide by a vector in the parallel direction. In particular, if we decompose $a$ into parallel and perpendicular components to the line of reflection of $\rho_\theta r$, the actual glide line goes through the midpoint of the perpendicular component.


Notice that changing the origin that we're rotating around adds a conjugation factor to our isometries. 


Fact 129 


There is an important homomorphism $\pi : M \to O_2$, where $O_2$ denotes the group of orthogonal operators. Basically, take any $t_a \rho_\theta \mapsto \rho_\theta$, and take $t_a \rho_\theta r \mapsto \rho_\theta r$.


In particular, we send $t_a \varphi \mapsto \varphi$ for any orthogonal operator $\varphi$. Remember that the group of isometries is not abelian!
It turns out this is a homomorphism, and we'll show that now. Recall the rule for moving translations past operators: $\varphi t_a = t_{\varphi(a)} \varphi$. So
$$\pi((t_a \varphi)(t_b \psi)) = \pi(t_a t_{\varphi(b)} \varphi \psi) = \varphi \psi = \pi(t_a \varphi) \pi(t_b \psi),$$
since $\varphi\psi$ is an orthogonal operator. This proves that $\pi$ is a homomorphism, and the kernel of $\pi$ is the group of all translations $T = \{t_a : a \in V = \mathbb{R}^2\}$.
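To see the composition rule and the homomorphism concretely, here is a small sketch under an assumed representation: an isometry $t_a \varphi$ is stored as a pair `(phi, a)` of a 2 x 2 matrix and a vector. The helper names (`compose`, `point_part`, and so on) are illustrative choices, not notation from the lecture.

```python
import math

def rot(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

REFLECT = ((1.0, 0.0), (0.0, -1.0))  # the standard reflection r over the x-axis

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def matvec(A, v):
    return tuple(sum(A[i][k] * v[k] for k in range(2)) for i in range(2))

def apply(f, x):
    """Apply t_a phi to x: first phi, then translate by a."""
    phi, a = f
    y = matvec(phi, x)
    return (y[0] + a[0], y[1] + a[1])

def compose(f, g):
    """(t_a phi)(t_b psi) = t_{a + phi(b)} (phi psi), using phi t_b = t_{phi(b)} phi."""
    (phi, a), (psi, b) = f, g
    pb = matvec(phi, b)
    return (matmul(phi, psi), (a[0] + pb[0], a[1] + pb[1]))

def point_part(f):
    """The map pi: drop the translation and keep the orthogonal operator."""
    return f[0]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

f = (matmul(rot(0.7), REFLECT), (1.0, 2.0))   # t_a rho_theta r
g = (rot(-1.3), (-3.0, 0.5))                  # t_b rho_theta'
x = (0.25, -4.0)
print(apply(compose(f, g), x), apply(f, apply(g, x)))  # the same point twice
print(close(point_part(compose(f, g)), matmul(point_part(f), point_part(g))))
```

Composing in the other order, `compose(g, f)`, gives a different orthogonal part here, which illustrates that the group of isometries is not abelian.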


Remark 130. Here, we should make a distinction between V, the plane of vectors, and P, the plane of points. A translation $t_v$ acts on P via $t_v(p) = p + v$; notice that we pick a specific zero vector or origin for V, but we do not need to do so for the plane of points P.


Let’s make a small assumption about the set of symmetries for our brick figure: we want it to be discrete. 




Definition 131 


A subgroup G of M is discrete if it does not contain arbitrarily small rotations (about any point) or translations.


In other words, there exists some $\epsilon_1 > 0$ such that $|v| > \epsilon_1$ if $t_v$ is in our subgroup of symmetries and $v \neq 0$. Similarly, there exists some $\epsilon_2 > 0$ such that $|\theta| > \epsilon_2$ if $\rho_{\theta,p} = t_a \rho_\theta \in G$ and $\theta \neq 0$.


Example 132 
The symmetries of a circle are not discrete: we can rotate by arbitrarily small angles. Similarly, the symmetries of a line are not discrete: they include translations by arbitrarily small lengths in the direction of the line.


So if G is a discrete subgroup of M, and $\pi : M \to O_2$ is the homomorphism with kernel T, we can think about the restricted map $\pi' : G \to \bar{G}$, where $\bar{G}$ is the image of G under $\pi$ in $O_2$. Then the kernel of $\pi'$ is exactly the translations $T_G = T \cap G$ in our discrete subgroup.


Definition 133 


The image of G under $\pi$, denoted $\bar{G}$, is called the point group of the group of symmetries.


What does the point group “remember” about G? Any rotation in G becomes just information about the angle of rotation; the point which the rotation was done about is lost. Similarly, in a glide reflection, we remember the angle or slope of the glide line $\alpha = \frac{1}{2}\theta$. (The translation is completely lost.) So now, let's call the corresponding rotations and reflections $\bar{\rho}_\theta$ and $\bar{\rho}_\theta \bar{r}$.

So now the point group is easy to find. We know that in a standard brick pattern, we can rotate by $\pi$ (either from the center of the brick or an edge, but those are equivalent). Thus for the standard brick pattern, the point group is
$$\bar{G} = \{\bar{1}, \bar{\rho}_\pi, \bar{r}, \bar{\rho}_\pi \bar{r}\}:$$
we can either reflect over a line or rotate by 180 degrees. This is $D_2$, the dihedral group, which is isomorphic to the Klein four group.

What about the plane figure F? It's a bit more confusing to visualize, but we have a 180 degree rotation about any center of a brick, which can be denoted $\rho_\pi$, and we actually have a glide reflection at a 45 degree angle. So it turns out that this brick pattern also has the same point group $\bar{G}$! However, the actual groups of symmetries G are different for the two patterns.

What more can we say about the kernel $T_G$? If G is a discrete subgroup of M, the corresponding point group $\bar{G}$ is a discrete subgroup of $O_2$. In addition, $T_G = \{t_a \in G\}$ is also a discrete subgroup of M. Specifically, $t_a \leftrightarrow a$ gives a correspondence between the set of translations $T_G$ and a set of vectors $L_G \subset V = \mathbb{R}^2$. In other words, since $T_G$ is discrete, $L = L_G$ is a discrete subgroup of $V^+$! In particular, G being discrete is equivalent to $\bar{G}$ and L both being discrete.


So what are the discrete subgroups of $O_2$ and of $V^+$? We can describe them pretty easily here:


Proposition 134 


If L is a discrete subgroup of the vectors $V^+ = \mathbb{R}^2$, then L is either $\{0\}$, $a\mathbb{Z}$ for some nonzero vector $a$, or $a\mathbb{Z} + b\mathbb{Z}$ for $a, b$ linearly independent.


If G is discrete and $L_G = a\mathbb{Z} + b\mathbb{Z}$, then we call G a plane crystallographic group. For these discrete groups, where the translations $a\mathbb{Z} + b\mathbb{Z}$ go in two different directions, there are 17 possible groups! In three dimensions, there are more than 200 of them. If we want, we can look up crystallographic groups, but they're pretty annoying and not that useful.


Proposition 135 


If $\bar{G}$ is a discrete subgroup of $O_2$, then it is the cyclic group $C_n$ generated by $\rho_\theta$, where $\theta = \frac{2\pi}{n}$, or $D_n$, the dihedral group, which is generated by $\rho_\theta$ and $r'$, where $r'$ is a reflection.


$D_n$ is the symmetry group of a regular n-gon. For example, $D_5$ gives 10 symmetries of a pentagon: rotate in one of 5 ways, or reflect in one of 5 ways.

We will continue talking about plane figures. Like last time, let G be a discrete subgroup of M, the group of isometries of the plane P. Recall that we have a homomorphism $\pi : M \to O_2$, which sends an isometry $t_a \rho_\theta \mapsto \bar{\rho}_\theta$ and $t_a \rho_\theta r \mapsto \bar{\rho}_\theta \bar{r}$: basically, it drops the translation. Then $G \subset M$, so the point group $\bar{G} \subset O_2$ is an important group to study.

Recall that the kernel of $\pi$ restricted to G is the set of translations $T \cap G$ in G, and we can define L to be the corresponding set of vectors $\{v \in \mathbb{R}^2 \mid t_v \in G\}$. We were only dealing with discrete subgroups, so L is discrete. We also know that $\bar{G}$, the point group, is either the cyclic group of n elements $C_n$, or the dihedral group of 2n elements $D_n$.


Fact 136 (Simple examples) 


$C_1$ is the trivial group, $D_1$ is the group generated by $r$, and $C_2$ is the group generated by $\rho_\pi$.


Also recall that in a plane, L is either trivial, $a\mathbb{Z}$ for some vector $a \neq 0$, or $a\mathbb{Z} + b\mathbb{Z}$ for linearly independent vectors $a, b$. In this last case, we have a lattice. Let's consider some special cases:


Example 137 
What if L = 0? In other words, what if the kernel of $\pi : G \to \bar{G}$ is trivial?


Then the only translation in the group G is the identity. (An example of this is a pie.) Here, $G \to \bar{G}$ is bijective, so G is also a cyclic or dihedral group. As a fun fact, dihedral groups $D_n$ are the symmetries of a regular n-gon in the plane.


Lemma 138 
If G is a finite group of isometries of the plane, and L = 0, then there exists a fixed point p € P such that gp = p 


for all g in G. 


This is not obvious; how do we know that all rotations, reflections, and so on all keep a point fixed? 


Proof. Start with an arbitrary point $q \in P$, and define its orbit to be $\{q_1, \cdots, q_N\} = \{gq \mid g \in G\}$, where $N = |G|$. Basically, apply all possible elements of G to $q$ (which means that we look at all possible images of $q$, counting multiplicity). Then the fixed point will be the center of mass or centroid
$$p = \frac{1}{N}(q_1 + \cdots + q_N).$$

To show that $p$ is fixed, we'll show that if $g$ is any isometry, and $q_1, \cdots, q_N$ are any points, the centroid of $\{gq_1, \cdots, gq_N\}$ is $gp$, where $p$ is the centroid of $q_1, \cdots, q_N$. This is enough to show the result, because multiplying $q_1, \cdots, q_N$ by an element $g \in G$ will result in those same points in some other order, so their average $p$ remains constant.

It is enough to show this for the two cases where $g$ is a translation or an orthogonal operator. If $g = t_v$, then $t_v q_i = q_i + v$, and
$$\frac{1}{N} \sum_i (q_i + v) = \frac{1}{N} \sum_i q_i + v = p + v = t_v p.$$
Meanwhile, if $g$ is an orthogonal operator, we just use the fact that $g$ is linear: $g(x + y) = g(x) + g(y)$ and $g(cx) = c\,g(x)$.
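As a quick sanity check of the centroid argument, the sketch below builds a finite group of isometries (the four rotations about a chosen point, an assumption made only for this illustration) and recovers the fixed point as the centroid of an orbit. The names `rotate_about`, `center`, and `q` are hypothetical.

```python
import math

def rotate_about(center, theta, x):
    """Rotate the point x by theta about the given center."""
    c, s = math.cos(theta), math.sin(theta)
    dx, dy = x[0] - center[0], x[1] - center[1]
    return (center[0] + c * dx - s * dy, center[1] + s * dx + c * dy)

center = (2.0, -1.0)  # used only to build the group; the proof never sees it
n = 4
group = [lambda x, k=k: rotate_about(center, 2 * math.pi * k / n, x) for k in range(n)]

q = (5.0, 3.0)                     # arbitrary starting point
orbit = [g(q) for g in group]      # the images gq, counted with multiplicity
p = (sum(x for x, _ in orbit) / n, sum(y for _, y in orbit) / n)
print(p)                           # approximately (2.0, -1.0)
print([g(p) for g in group])       # every element fixes p, up to rounding
```

Any other starting point `q` gives the same centroid for this group, since the only point fixed by a nontrivial rotation is its center.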


Note that the point group $\bar{G}$ operates on L:


Lemma 139


If $v \in L$ and $\bar{g} \in \bar{G}$, then $\bar{g}v \in L$.


Proof. If $v \in L$, then there is a translation $t_v \in G$. If $\bar{g} \in \bar{G}$, then there is an element $t_a g \in G$ (where $g$ is orthogonal) that maps to $\bar{g}$. Conjugating $t_v$ by $t_a g$,
$$(t_a g) t_v (t_a g)^{-1} = t_a (g t_v) g^{-1} t_{-a} = t_a t_{gv} g g^{-1} t_{-a} = t_{gv},$$
so $gv \in L$.


However, G does not necessarily operate on L. For example, it's possible that G contains only glide reflections but not pure reflections.


Example 140 
Now let's look at the other extreme. What if $L = a\mathbb{Z} + b\mathbb{Z}$, and we have a crystallographic restriction on our isometries?


Theorem 141


If G is a discrete subgroup of M and L is a lattice, then $\bar{G}$ is $C_n$ or $D_n$, with the additional restriction that we must have one of n = 1, 2, 3, 4, 6.


In other words, there are no crystals with five-fold rotational symmetry! Quasicrystals with five-fold symmetry do exist, but they do not have translational symmetry.


Proof. Choose $a \in L$ to be (one of) the shortest vector(s) that is not the zero vector. This exists because L is discrete. Let's say that $\bar{\rho}_\theta \in \bar{G}$: by the previous lemma, $\bar{G}$ operates on L, so $\rho_\theta a \in L$.

But L is a lattice, and it is closed under addition, so $a - \rho_\theta a \in L$ as well. Since $a$ was the shortest nonzero vector, the length of $a - \rho_\theta a$ must be at least as large as the length of $a$, and if $0 < \theta < \frac{2\pi}{6}$, this is not true. Since $\theta = \frac{2\pi}{n}$, we must have $n \le 6$.

Now, for n = 5, there is a separate argument. Again, pick $a$ to be a shortest vector, and also consider $\rho a, \rho^2 a$ (where $\rho$ is rotation by $\frac{2\pi}{5}$). Now notice that $|a + \rho^2 a| < |a|$, since $a$ and $\rho^2 a$ have the same length and meet at an angle of 144 degrees. This is a contradiction, so five-fold symmetry is not possible either.
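The two length comparisons in this proof are easy to tabulate. The sketch below normalizes $|a| = 1$ and, for each n, checks whether $a - \rho a$ or $a + \rho^2 a$ would be a nonzero vector shorter than $a$; the function name `violates_minimality` and the tolerance are illustrative choices. It only confirms which n survive these two tests, not that each surviving n is actually realized by a lattice.

```python
import math

def violates_minimality(n, tol=1e-9):
    """True if rotation by 2*pi/n forces a nonzero vector of L shorter than |a| = 1."""
    theta = 2 * math.pi / n
    # Squared lengths of a - rho(a) and a + rho^2(a) for a unit vector a
    candidates = [2 - 2 * math.cos(theta), 2 + 2 * math.cos(2 * theta)]
    # A squared length of 0 is the zero vector, which is no contradiction
    return any(tol < sq < 1 - tol for sq in candidates)

allowed = [n for n in range(1, 13) if not violates_minimality(n)]
print(allowed)  # [1, 2, 3, 4, 6]
```

For n = 5 the first test passes (the squared length is about 1.38) and the second one fails (about 0.38), matching the separate argument in the proof.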


So we have already found 10 possible point groups $\bar{G}$, and it turns out there are 17 groups G in total that result from this.


Let's do an example of a computation in which we find G from $\bar{G}$:

Example 142


Say $\bar{G}$ contains $\bar{\rho} = \bar{\rho}_{\pi/2}$, so $\bar{G} = C_4$ or $D_4$. What are the possible groups G?


Let's say $L \neq 0$ (otherwise the group G is just $C_4$ or $D_4$). Then picking $a$ to be the shortest nonzero vector in L, $b = \rho a \in L$ as well. And now any integer linear combination of $a$ and $b$ is also in L.
Claim 143. $L = a\mathbb{Z} + b\mathbb{Z}$, which forms a square grid.


Proof of claim. We know that L contains $a\mathbb{Z} + b\mathbb{Z}$, and if there were other points, there would be some $v \in L$ inside one square of our grid that is not a vertex. Then the distance from $v$ to the closest vertex is smaller than the side length, contradicting the minimality of $|a|$.


Now, the elements in G that map to $\bar{\rho}$ in $\bar{G}$ are rotations by $\frac{\pi}{2}$ about points in the plane. Choose our coordinates so that 0 is the rotation point and the side length is 1. So now $a = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $b = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, and $L = \mathbb{Z}^2$.


We can now let $H = \{t_v \mid v \in L\}$ be the group of translations. If $\bar{G} = C_4$, then the index of H in G is 4, and G will be the union of the cosets $1H \cup \rho H \cup \rho^2 H \cup \rho^3 H$ (that is, combining a 0, 90, 180, or 270 degree rotation with a translation in $\mathbb{Z}^2$). We also know how to multiply using the isometry rules, and the group is determined. For the same reason in general, such a group of isometries is defined for all cyclic groups $C_1, C_2, C_3, C_4, C_6$.

Meanwhile, if $\bar{G} = D_4$, things are more complicated. Then $\bar{r} \in \bar{G}$, so there exists some vector $u_0$ such that $t_{u_0} r \in G$. Ideally, we want to show that $r \in G$, but we only know that for all $v \in L$ (that is, $t_v \in H$), we have $t_v t_{u_0} r \in G$. Thus, we can only move our vector $u_0$ to some vector $u$ in the square generated by $a$ and $b$ (it might be the origin, but it might not be).


Now $t_u r \in G$, so $(t_u r)^2 = t_u t_{ru} r^2 = t_{u + ru} \in G$. Let $u = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}$. We can calculate what $u$ must be: $u + ru$ must be in L, so $\begin{pmatrix} 2u_1 \\ 0 \end{pmatrix} \in L$: this means $u_1 = 0$ or $\frac{1}{2}$. We can do some more work and find that this only corresponds to two possibilities: $u$ is the origin, or $u = \begin{pmatrix} 1/2 \\ 1/2 \end{pmatrix}$. $u$ being the origin means that $r \in G$, which is good, but the other case is more confusing. $t_u r$ is a glide reflection along the horizontal line $\frac{1}{4}$ of the way up the square grid, followed by a translation to the right by $\frac{1}{2}$. To visualize a group with this symmetry, around all points $(\mathbb{Z}, \mathbb{Z})$, put a counterclockwise-oriented symbol, and around all points $(\mathbb{Z} + \frac{1}{2}, \mathbb{Z} + \frac{1}{2})$, put a clockwise-oriented symbol. Indeed, the point group here is $D_4$.


16 October 17, 2018 


Today, we'll start talking about group operations. We'll start with some examples: let G be a group of isometries of the plane P. G operates on the set of lines in P: if $f \in G$ is an isometry, and L is a line in P, then L is sent to $fL$.


The isometries also operate on triangles: $\Delta$ gets sent to $f\Delta$.


Example 144 


Let $G = D_5$, the symmetries of a regular pentagon. Then $D_5$ operates on the vertices (and also on the edges).


However, group operations don't have to be geometric.

Example 145
Let V be the Klein Four group $\{1, a, b, c\}$, defined such that $a^2 = b^2 = c^2 = 1$, $ab = ba = c$, $bc = cb = a$, $ac = ca = b$. Permuting $a, b, c$ doesn't change the properties of the group, so $S_3$, the symmetric group, operates on V, the Klein Four group!


Fact 146 
A game was played in the Kennedy days where the answer would be given and you had to come up with a question. 


The answer was “9W.” Well, the question was, “Do you spell your name with a V, Mr. Wagner?”


Time to actually give a formal definition of what's going on: 


Definition 147 
Let G be a group and S be a set. An operation of G on S is a map $G \times S \to S$ that sends $(g, s) \mapsto s' = gs$, with the following properties:

* $1s = s$ for all $s \in S$; that is, the identity does nothing to S.

* $g_1(g_2 s) = (g_1 g_2)s$; the associative law holds. As a corollary, $g^{-1}(gs) = s$, so inverses also do inverse operations on S.

* $gs = gs'$ if and only if $s = s'$. (This follows from the first two properties.)


Notice that this means that operations permute the set S (since they have “inverses”)! Also, notice that $gs = g's$ does not tell us that $g = g'$.


Definition 148 
Let G operate on S, and let $s \in S$. Then the orbit of s is the set of points
$$\text{Orbit}(s) = \{s' \in S : \exists g \in G, \ gs = s'\}.$$


Example 149 
There is always an isometry that sends any line in the plane to any other line. This means that the orbit of any 


line is the set of all lines. 


Example 150 
On the other hand, the orbit of any triangle in the plane is the set of all triangles congruent to it. (We have to 


preserve lengths and angles under isometries. ) 


If Orbit(s) is the whole set S for all s, then we call the operation transitive. 


Proposition 151 


Orbits of a group operation are basically “cosets” in the sense that they partition S.

Proof. For any element s, $s = 1s \in \text{Orbit}(s)$. So orbits are not empty, and they cover the set S, since every element is in an orbit. So, to complete the proof, we want to show that if two orbits Orbit(s) and Orbit(s') have nonempty intersection, they are the same set.

Equivalently, we can show that $s' \in \text{Orbit}(s) \implies \text{Orbit}(s') = \text{Orbit}(s)$. This is really not hard; if s' is in the orbit of s, then $s' = gs$ for some g. But then the orbit of s' is $\{xs' \mid x \in G\} = \{xgs \mid x \in G\}$ and $xg \in G$, so any element in the orbit of s' is in the orbit of s. To get the other inclusion, repeat the argument but using $s = g^{-1}s'$.


Proposition 152 (First Counting Formula) 
Let S be a finite set, and let G operate on S. If the orbits are denoted $O_1, \cdots, O_k$, then
$$|S| = \sum_{i=1}^{k} |O_i|.$$

Definition 153 
Let G operate on S, and let $s \in S$. Then the stabilizer of s, denoted Stab(s), is the set of group elements $g \in G$ such that $gs = s$.


Stab(s) forms a subgroup of G. This is not very hard to show - identity, closure, inverses are all satisfied.


Example 154 
What's the stabilizer of a fixed triangle in the plane? 


It might seem like the only such isometry is the identity, since an isometry of the plane is determined by where it sends three non-collinear points. But recall that we're acting on the set of triangles, so an equilateral triangle (for example) can have nontrivial rotations or reflections that fix the triangle, even if they don't fix the vertices or edges themselves. So the stabilizer is the trivial group $\{1\}$, $\{1, r\}$, or $D_3$, depending on whether the triangle is scalene, isosceles, or equilateral (respectively).


Example 155 
Let H be the stabilizer Stab(s), and let $s' \in \text{Orbit}(s)$, so that there exists $g \in G$ with $gs = s'$. What are all elements $x \in G$ that send s to s'?


This is the coset $gH$. Indeed, if $xs = s' = gs$, then $g^{-1}xs = s \implies g^{-1}x \in \text{Stab}(s)$. Thus, $g^{-1}x \in H \implies x \in gH$. Conversely, any $x = gh$ with $h \in H$ satisfies $xs = g(hs) = gs = s'$. In particular, this means that there is one coset of H for every element in the orbit of s, and that leads us to the following result:


Proposition 156 (Second Counting Formula) 
Let G be a finite group acting on a set S. Then for any element $s \in S$, $|G| = |\text{Orbit}(s)| \cdot |\text{Stab}(s)|$.


Recall the formula $|G| = [G : H]|H|$ for a subgroup H of G. In this case, if H is Stab(s), then $|\text{Orbit}(s)|$ is the index of H in G.


Example 157 


$SO_3$ is the group of rotations of $\mathbb{R}^3$. Take G to be the rotational symmetries of a cube centered at the origin.

We can rotate in the Rubik's cube style by $\frac{\pi}{2}$, around a space diagonal by $\frac{2\pi}{3}$, or around the axis through the center of an edge by $\pi$. So G is the stabilizer of the cube in $SO_3$.

A cube has 6 faces. Let's look at the rotations that fix F, the front face. Then the size |Stab(F)| is 4, since there are four possible rotations of the front square. But the face can be sent to any of the 6 faces in general, so the counting formula tells us that there are $6 \cdot 4 = 24$ rotational symmetries of a cube. Similarly, there are 8 vertices, and the order of the stabilizer of any vertex is 3. Also, there are 12 edges, and the order of the stabilizer of any edge is 2. Notice that
$$6 \cdot 4 = 8 \cdot 3 = 12 \cdot 2 = 24.$$
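These counts can be reproduced by brute force. The sketch below assumes the cube has vertices $(\pm 1, \pm 1, \pm 1)$, so that its rotations are exactly the 3 x 3 signed permutation matrices with determinant 1; faces, vertices, and edges are represented by a face center, a vertex, and an edge midpoint. All names are illustrative.

```python
from itertools import permutations, product

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cube_rotations():
    """Signed permutation matrices with determinant +1."""
    mats = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = tuple(tuple(signs[i] if j == perm[i] else 0 for j in range(3))
                      for i in range(3))
            if det3(m) == 1:
                mats.append(m)
    return mats

def act(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def orbit_and_stabilizer_sizes(group, s):
    orbit = {act(g, s) for g in group}
    stabilizer = [g for g in group if act(g, s) == s]
    return len(orbit), len(stabilizer)

G = cube_rotations()
face, vertex, edge = (0, 0, 1), (1, 1, 1), (1, 1, 0)
print(len(G))                                 # 24
print(orbit_and_stabilizer_sizes(G, face))    # (6, 4)
print(orbit_and_stabilizer_sizes(G, vertex))  # (8, 3)
print(orbit_and_stabilizer_sizes(G, edge))    # (12, 2)
```

In each case the product of the orbit size and the stabilizer size is 24.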


As a final note, let H be a subgroup of the group G, and let $\mathcal{C}$ be the set of left cosets of H. Then G operates on $\mathcal{C}$ by the law $(g, C) \mapsto gC$ (in other words, if $C = aH$, then $gC = gaH$). This is a transitive operation, and the stabilizer of the coset $1H$ is just H, so the orbit-stabilizer counting formula here reduces to the earlier formula $|G| = [G : H]|H|$.


17 October 19, 2018 


Today's topic is finite rotation groups. Recall that if a group G operates on a set S, then we define the orbit of s to be $\text{Orbit}(s) = \{s' \mid s' = gs \text{ for some } g \in G\}$ and the stabilizer of s to be $\{g \mid gs = s\}$. Orbits partition S, much like cosets do, and the order of G is always the product of the orders of the orbit and stabilizer of s, for any element s.


Corollary 158 


If s, s' are in the same orbit, then the orders of Stab(s) and Stab(s') are the same.


As a sidenote, let's say H is the stabilizer of s, and $s' = gs$ for some g. Then the stabilizer of s' is the conjugate subgroup $gHg^{-1}$.


Theorem 159 
Let G be a finite subgroup of $SO_3$, the group of rotations of $\mathbb{R}^3$. Then G must be $C_n$, a cyclic group, $D_n$, a dihedral group, or $T, O, I$, the rotation groups of the tetrahedron, octahedron, and icosahedron, respectively.


(We may ask how to get reflections in the dihedral group using rotations. Well, we can rotate in the third dimension, 
which gives us an extra degree of freedom.) 

Note that the cube and octahedron are duals: connecting centers of the cube faces gives an octahedron, and vice 
versa. This means that they have the same rotation groups! Similarly, the icosahedron and dodecahedron are also 
duals. 


To analyze these, we will look at the group operations on two sets. 


Definition 160 


A pole of $g \neq 1$ in G is a unit vector along the axis of rotation of g.


Every $g \in G$ that is not the identity has two poles (a unit vector and its negative). One nice way to say this is that p is a pole of $g \neq 1$ if $|p| = 1$ and $gp = p$.


Lemma 161


G operates on the set of poles P. In other words, $g \in G, p \in P \implies gp \in P$.

Proof. If p is a pole, then there exists some $\rho \neq 1$ in G such that p is a pole of $\rho$. In other words, $\rho p = p$. Now, if $p' = gp$, then
$$g \rho g^{-1} p' = g \rho (g^{-1} p') = g \rho p = gp = p',$$
so $g \rho g^{-1}$ fixes p'. And $g \rho g^{-1}$ is not the identity, because that would imply $g \rho g^{-1} = 1 \implies g\rho = g \implies \rho = 1$.

Thus p' is indeed a pole of some element of G.


With this, we can now prove Theorem 159: 


Proof. Since G operates on P, we can decompose P into orbits
$$P = O_1 \cup \cdots \cup O_k.$$

Let's say $|O_i| = n_i$, and let $|G| = N > 1$ (otherwise it's the trivial group $C_1$). If a pole $p \in O_i$, then its stabilizer Stab(p) has order $r_i = \frac{N}{n_i}$, depending only on i (which orbit it is in). But we also know that the number of poles is $|P| = |O_1| + \cdots + |O_k|$. (Notice that the stabilizer of a pole p is the set of rotations about that axis, which forms a cyclic group of order $r_i$.)

So now define the set $S = \{(g, p) \mid g \neq 1 \in G, \ p \text{ a pole of } g\}$. (Such ordered pairs (g, p) are called spins for some reason.) Since every element differing from 1 has 2 poles, there are a total of $2(N - 1)$ spins. But there's another way to count S: consider a fixed pole $p \in O_i$. Then there are $|\text{Stab}(p)| - 1 = r_i - 1$ such (g, p) that work, since we need g to be a nonidentity element in the stabilizer of p. Summing this over all p (which can be done by summing over all orbits),
$$2N - 2 = |S| = \sum_{p \in P} (r_p - 1) = \sum_{i=1}^{k} n_i (r_i - 1) = \sum_{i=1}^{k} (N - n_i),$$
where the last equality comes from $n_i r_i = N$. Now dividing by N,
$$2 - \frac{2}{N} = \sum_{i=1}^{k} \left(1 - \frac{1}{r_i}\right).$$
Notice that $2 - \frac{2}{N} < 2$, while $1 - \frac{1}{r_i} \ge \frac{1}{2}$ since $r_i > 1$ (and we know that $r_i \neq 1$ because any pole p must have a nontrivial rotation that stabilizes it by definition). So this means that $k \le 3$. Time for casework!

* If k = 1, the left hand side is $\ge 1$, while the right hand side is $< 1$. This can't happen.

* If k = 2, then $2 - \frac{2}{N} = \left(1 - \frac{1}{r_1}\right) + \left(1 - \frac{1}{r_2}\right)$, so $\frac{2}{N} = \frac{1}{r_1} + \frac{1}{r_2}$. But $r_i$ is the order of a stabilizer, so $r_i \mid N$, and we must therefore have $r_1 = r_2 = N$. So there are two poles, fixed by all $g \in G$ (since the orbits are trivial). This means they are opposite poles on a single axis, and this leads to $G = C_N$.

* Now, if k = 3, we have
$$\frac{1}{r_1} + \frac{1}{r_2} + \frac{1}{r_3} = 1 + \frac{2}{N} > 1.$$
This is actually really hard to satisfy; if none of the $r_i$ are 2, then the left hand side is at most 1. So there are only a few possibilities: the edge cases, where the left hand side is exactly 1, are (3, 3, 3), (2, 4, 4), and (2, 3, 6). WLOG let $r_1 \le r_2 \le r_3$; we must have $r_1 = 2$ and $r_2 = 2$ or 3. So we have $(2, 2, r), (2, 3, 3), (2, 3, 4), (2, 3, 5)$, which correspond to $N = 2r, 12, 24, 60$ and $(n_1, n_2, n_3) = (r, r, 2), (6, 4, 4), (12, 8, 6), (30, 20, 12)$. These correspond to $D_r, T, O, I$ respectively. In fact, for the tetrahedron, octahedron, and icosahedron, these orbits actually correspond to edges, vertices, and faces!

Example 162


Professor Artin brought a power line insulator. Its group of symmetries is $D_4$, but the element of order 4 is not a pure rotation; it is orientation-reversing.
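Returning to the casework in the proof of Theorem 159: the three-orbit case can be double-checked by brute force. The sketch below searches triples $r_1 \le r_2 \le r_3$ up to an arbitrary bound `max_r` (an assumption of this illustration, as is the function name) for which $\frac{1}{r_1} + \frac{1}{r_2} + \frac{1}{r_3} = 1 + \frac{2}{N}$ has an integer solution N divisible by every $r_i$.

```python
from fractions import Fraction

def three_orbit_solutions(max_r=10):
    """Triples (r1, r2, r3) with 1/r1 + 1/r2 + 1/r3 = 1 + 2/N and each r_i dividing N."""
    sols = []
    for r1 in range(2, max_r + 1):
        for r2 in range(r1, max_r + 1):
            for r3 in range(r2, max_r + 1):
                excess = Fraction(1, r1) + Fraction(1, r2) + Fraction(1, r3) - 1
                if excess <= 0:
                    continue  # the sum of reciprocals must exceed 1
                N = Fraction(2) / excess
                if N.denominator == 1 and all(N % r == 0 for r in (r1, r2, r3)):
                    N = int(N)
                    sols.append(((r1, r2, r3), N, (N // r1, N // r2, N // r3)))
    return sols

for rs, N, ns in three_orbit_solutions():
    print(rs, N, ns)
```

Besides the dihedral family $(2, 2, r)$ with $N = 2r$, the search finds only (2, 3, 3), (2, 3, 4), and (2, 3, 5), with $N = 12, 24, 60$ and orbit sizes (6, 4, 4), (12, 8, 6), (30, 20, 12).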


18 October 22, 2018 


We'll now consider operations of a group on itself: what are some simple examples? First of all, the simplest operation, left multiplication, sends $(g, x) \in G \times G \mapsto gx$. This is not very interesting.
On the other hand, conjugation sends $(g, x)$ to $g * x = gxg^{-1}$. It's good to check that this is indeed a valid group operation: we want $1 * x = x$ and $g_1 * (g_2 * x) = (g_1 g_2) * x$ for all $x \in G$. The first is easy, and
$$g_1 * (g_2 * x) = g_1 * (g_2 x g_2^{-1}) = g_1 g_2 x g_2^{-1} g_1^{-1} = (g_1 g_2) x (g_1 g_2)^{-1}.$$
So now that we know conjugation is a group operation, we can talk about orbits and stabilizers of this action.


Definition 163 
The orbit of $x \in G$ is $\{x' \mid x' = gxg^{-1}, g \in G\}$, which is the set of elements conjugate to x. This is called the conjugacy class of x, and it's often denoted C(x).


Definition 164


The stabilizer of x is all g such that $gxg^{-1} = x$; that is, $gx = xg$. Thus, this is the set of g that commute with x, which is the centralizer Z(x).


Thus, the counting formula (that is, the orbit-stabilizer theorem) tells us that
$$|G| = |C(x)| \cdot |Z(x)|$$
for any $x \in G$. We also know that the orbits partition our group G:


Theorem 165 (Class Equation) 


Let $C_1, C_2, \cdots, C_k$ be the conjugacy classes of a group G. Then $|G| = |C_1| + \cdots + |C_k|$.


For example, the only element conjugate to the identity is itself, and thus one of the $|C_i|$s must be 1.


Let's discuss a few more important things about each conjugacy class C(x):
* $x \in C(x)$.
* $|C(x)|$ divides the order of the group $|G|$. Notice that this means $|G| = |C_1| + \cdots + |C_k|$ is a sum of divisors of $|G|$, one of which is 1.

Similarly, we can talk about the centralizer:
* $x \in Z(x)$.
* $|Z(x)|$ divides the order of the group $|G|$.


When is the order of a conjugacy class C(x) equal to 1? This means that everything in the group commutes with x, and the order of the centralizer is $|G|$.


Definition 166 


The center of a group G is the set $Z = \{x \in G : gx = xg \ \forall g \in G\}$.


Notice that in an abelian group, the class equation is useless, since the conjugation operation does nothing. In addition, the elements of Z are each in their own conjugacy class, so we get additional terms of 1 in the class equation. Finally, $Z \subset Z(x)$ for any x.


It is important here to state a quick fact: 


Lemma 167 


Conjugate elements in a group have the same order. 


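Proof. If $x^r = 1$, then $(gxg^{-1})^r = gx^rg^{-1} = gg^{-1} = 1$. So the order of $gxg^{-1}$ divides the order of x, and by symmetry (x is conjugate to $gxg^{-1}$ via $g^{-1}$) the two orders are equal.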


Example 168 


Consider $D_5$, the symmetries of a pentagon. $D_5$ has order 10; what is its class equation?


Let $x = \rho_{2\pi/5}$ and $y = r$ (the standard reflection). Then the group is described by
$$G = \{x^a y^b \mid x^5 = 1, \ y^2 = 1, \ xy = yx^{-1}\}.$$

Notice that $x, x^2, x^3, x^4$ all have order 5. The other nonidentity elements have order 2, since they are just reflections. By the above lemma, the conjugacy class of x must be a subset of $\{x, x^2, x^3, x^4\}$. It doesn't have to be the whole set - in fact, it can't be, because 4 is not a divisor of 10.

We know that $C(1) = \{1\}$ (the identity is conjugate to only itself). Notice that $yx = x^{-1}y \implies yxy^{-1} = x^{-1} = x^4$. Thus, the conjugacy class of x contains $x^4$, and therefore $C(x) = \{x, x^4\}$ (we can't add anything else or the order of C(x) wouldn't divide 10). Similarly, $C(x^2) = \{x^2, x^3\}$.

Now let's look at the other elements. In general, it's easier to find the centralizer of an element than the conjugacy class, because it is a subgroup (and that gives it more structure). So let's find the centralizer of y. Z(y) contains y, and since it's a group, $|\langle y \rangle| = 2$ divides $|Z(y)|$, which divides $|G| = 10$. But this means $|Z(y)|$ must be 2 or 10, and it's not 10, since y is not in the center of G (for instance, $yx \neq xy$). Thus $|Z(y)| = 2$ and $|C(y)| = 5$.

This gives our class equation
$$10 = 1 + 2 + 2 + 5.$$

Since there's only one 1, this proves that the only element of the center in $D_5$ is the identity.
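The class equation can also be computed directly. In the sketch below, an element $x^a y^b$ of the dihedral group of order 2n is stored as a pair `(a, b)`, and products are reduced with the relation $yx = x^{-1}y$; this encoding and the function names are illustrative assumptions.

```python
def dihedral(n):
    """Elements x^a y^b of the dihedral group of order 2n, stored as pairs (a, b)."""
    elems = [(a, b) for a in range(n) for b in range(2)]

    def mul(g, h):
        (a, b), (c, d) = g, h
        # (x^a y^b)(x^c y^d) = x^(a + c) y^d if b = 0, and x^(a - c) y^(1 + d) if b = 1
        return ((a + (c if b == 0 else -c)) % n, (b + d) % 2)

    def inv(g):
        a, b = g
        return ((-a) % n, 0) if b == 0 else (a, 1)  # reflections are their own inverses

    return elems, mul, inv

def class_sizes(n):
    """Sorted sizes of the conjugacy classes of the dihedral group of order 2n."""
    elems, mul, inv = dihedral(n)
    remaining, sizes = set(elems), []
    while remaining:
        x = min(remaining)
        cls = {mul(mul(g, x), inv(g)) for g in elems}
        sizes.append(len(cls))
        remaining -= cls
    return sorted(sizes)

print(class_sizes(5))  # [1, 2, 2, 5]
```

Calling `class_sizes(4)` on the dihedral group of order 8 gives `[1, 1, 2, 2, 2]`, where the second 1 reflects a nontrivial center.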


This kind of class equation calculation can be insightful in different ways: 


Definition 169 


A p-group is a group whose order is $p^r$ for some $r \ge 1$ and prime p.

Proposition 170


The center of a p-group is not just the trivial group $\{1\}$.


Proof. The class equation of G is of the form
$$p^r = 1 + \sum_{i \ge 2} |C_i| = 1 + \sum_{i \ge 2} p^{c_i},$$
where the $c_i$s are nonnegative integers (each $|C_i|$ divides $|G| = p^r$). Remember that terms of 1 in the class equation correspond to elements of the center. If there is nothing else in the center besides the identity element, then all terms in the sum are multiples of p. This is a contradiction, since the left side is 0 mod p while the right side is 1 mod p. Thus, we must have some additional $p^0 = 1$ terms.


Example 171 
$D_4 = \{x^a y^b \mid x^4 = y^2 = 1, \ yx = x^{-1}y\}$ has 8 elements, and 8 is a prime power. Thus, its center is nontrivial. It turns out the center is $\{1, x^2\}$, since $yx^2 = x^{-2}y = x^2y$. We can check that the class equation turns out to be
$$8 = 1 + 1 + 2 + 2 + 2.$$


Example 172 
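Let G be the set of $3 \times 3$ matrices of the form
$$\begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix},$$
where a, b, c are integers mod p.


The order of the group is $|G| = p^3$. What is the center of G: that is, when is it true that
$$\begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix}$$
for all x, y, z?

Doing out the matrix multiplication, this happens when
$$x + a = a + x, \qquad b + xc + y = y + az + b, \qquad z + c = c + z.$$

Most of this is trivially satisfied: the only problem is that we need to have $xc = az$. Well, this can only hold for all x, y, z if $a = c = 0$. So the center has order p: it's all matrices of the form $\begin{pmatrix} 1 & 0 & b \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.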
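For a small prime the center can be found by exhaustive search. In the sketch below a matrix with upper-right entries a, b, c is stored as the triple `(a, b, c)`, and the product formula comes from multiplying two such matrices out by hand; the function name `heisenberg_center` is an illustrative choice, not terminology from the lecture.

```python
from itertools import product

def heisenberg_center(p):
    """Order and center of the group of matrices [[1, a, b], [0, 1, c], [0, 0, 1]] mod p."""
    elems = list(product(range(p), repeat=3))  # triples (a, b, c)

    def mul(g, h):
        (a, b, c), (x, y, z) = g, h
        # Top row of the product is (1, a + x, b + y + a*z); the middle row ends in c + z
        return ((a + x) % p, (b + y + a * z) % p, (c + z) % p)

    center = [g for g in elems if all(mul(g, h) == mul(h, g) for h in elems)]
    return len(elems), center

order, center = heisenberg_center(3)
print(order)   # 27
print(center)  # [(0, 0, 0), (0, 1, 0), (0, 2, 0)]
```

For p = 3 the center consists of the three triples with a = c = 0, in line with the computation above.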


Theorem 173 


If $|G| = p^2$, then G is abelian.

Proof. We know that Z is not trivial by Proposition 170, and the order of the center $|Z|$ divides $|G| = p^2$, meaning it is either p or $p^2$.

If it is the latter, we are done: everything commutes with everything. Assume the former for contradiction. Then there exists some element $x \in G$ that is not in the center. Let H be the subgroup generated by Z and x. Since Z is a subgroup of H, $|Z| = p$ divides $|H|$, which divides $|G| = p^2$. In addition, H contains x, which is not in Z. Thus $|H| = p^2$, and everything in the group G can be generated by Z and x.

But $x \in Z(x)$ and $Z \subset Z(x)$, so $Z(x) = G$ is the whole group. Thus, x commutes with all $g \in G$. But this means x is in the center, contradiction.


Corollary 174
If $|G| = p^2$, then $G$ is either cyclic of order $p^2$ or the direct product of two cyclic groups of order $p$. In other words, $G \cong C_{p^2}$ or $G \cong C_p \times C_p$.

Proof. All elements of $G$ that are not the identity have order $p$ or $p^2$. If some element $x$ has order $p^2$, the subgroup generated by $x$ is all of $G$, so $G$ is cyclic. Otherwise, all non-identity elements have order $p$. Take two of them, $x, y$, such that $y$ is not a power of $x$. Then these generate the whole group, and since $G$ is abelian by Theorem 173, we can check that it is isomorphic to $C_p \times C_p$.


19 October 24, 2018 


Today we're going to calculate some more class equations. First, recall the definition: for any $x \in G$, let $C(x)$ be the conjugacy class of $x$, which is the set of all elements $x'$ that can be written as $gxg^{-1}$ for some $g \in G$. We also let $Z(x)$ be the centralizer of $x$: the set of all elements $g$ such that $gx = xg$. Recall that $|G| = |C(x)||Z(x)|$ and that $x$ lies in both $C(x)$ and $Z(x)$. Also, the center $Z$ of the group is a subset of $Z(x)$.

Then the class equation says that $|G| = \sum_i |C_i|$, where the $C_i$ are the distinct conjugacy classes. Since 1 is by itself in a conjugacy class, $|C_1| = 1$ is one of the terms in the sum. Also, elements of the center $Z$ contribute 1s in the class equation.


Example 175
Let $G = SL_2(\mathbb{F}_3)$, the set of $2 \times 2$ matrices with entries in $\{-1, 0, 1\}$ (integers mod 3) and determinant 1.

First of all, what's the order of $SL_2(\mathbb{F}_3)$? We did this for homework; it's $(p-1)p(p+1) = 24$ in this case, where $p = 3$. So we'll write 24 as a sum of sizes of conjugacy classes.

First of all, $\pm I$ are in the center, so two of the terms are 1s. Only 22 elements to go!


Fact 176
The characteristic polynomial of a matrix $x$ is the same as the characteristic polynomial of $gxg^{-1}$. (This is because $gxg^{-1}$ is the same operator in a different basis, so the eigenvalues should stay the same.)

Thus, all matrices in a conjugacy class will have the same characteristic polynomial. What are the possible characteristic polynomials? They will be of the form $p(t) = t^2 - (\operatorname{tr} x)t + (\det x) = t^2 - (\operatorname{tr} x)t + 1$. Then $\operatorname{tr} x$, the trace, can be $-1$, $0$, or $1$, and this limits us to three possible characteristic polynomials: $t^2 + 1$, $t^2 + t + 1 = (t-1)^2$, and $t^2 - t + 1 = (t+1)^2$ (working mod 3).
First of all, we find a matrix $A$ with characteristic polynomial $t^2 + 1$. One example that works is

$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.

To find the order of the conjugacy class of $A$, we wish to find its centralizer: that is, the set of matrices $P$ such that $PA = AP$. Let $P = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$; then we're asking for

$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix}$,

which is true if $a = d$, $b = -c$. So matrices in the centralizer are of the form $P = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$, and to have determinant 1, we need $a^2 + b^2 \equiv 1 \pmod 3$. So one of $a^2$ and $b^2$ is 1 and the other is 0: the possibilities for $(a, b)$ are $(0, \pm 1)$ and $(\pm 1, 0)$. This means there are 4 elements in $Z(A)$, so $|C(A)| = 24/4 = 6$.


Next, we look at $B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = I + e_{12}$. (This has a different trace, so it is in a different conjugacy class.) If $P$ is in the centralizer of $B$, which means $PB = BP$, we need $Pe_{12} = e_{12}P$, which means $c = 0$, $a = d$. For this to have determinant 1, $a = \pm 1$ and then there are 3 choices for $b$, so $|Z(B)| = 6$ and $|C(B)| = 4$.

Similarly, we can take $B' = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} = -I + e_{12}$; a similar calculation yields another $|Z(B')| = 6$ and $|C(B')| = 4$.

Now we try $B^T = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$. This has the same characteristic polynomial as $B$; are $B$ and $B^T$ in the same conjugacy class? In other words, we have to solve $PBP^{-1} = B^T$. Since $B = I + e_{12}$ and $B^T = I + e_{21}$, we can equivalently ask whether $Pe_{12} = e_{21}P$. This happens if $a = 0$, $b = c$. But this leads to a matrix of the form $\begin{pmatrix} 0 & b \\ b & d \end{pmatrix}$, which has determinant $-b^2 \in \{0, -1\}$ and so can never have determinant 1.

So $B$ and $B^T$ are not conjugates, and we do have a different conjugacy class. An identical calculation also gives $|C(B^T)| = 4$, and similarly $|C(B'^T)| = 4$. Finally, we've arrived at our final class equation:

$24 = 1 + 1 + 6 + 4 + 4 + 4 + 4.$
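The class equation for $SL_2(\mathbb{F}_3)$ can also be confirmed by enumerating all 24 matrices. This is only a sketch under the assumption that a matrix is stored row-major as the tuple `(a, b, c, d)` with entries in `{0, 1, 2}`; the helper names are illustrative.

```python
from itertools import product

P = 3

def mul(m, n):
    """Multiply 2x2 matrices stored row-major as (a, b, c, d), entries mod P."""
    a, b, c, d = m
    e, f, g, h = n
    return ((a*e + b*g) % P, (a*f + b*h) % P, (c*e + d*g) % P, (c*f + d*h) % P)

def inv(m):
    """The inverse of a determinant-1 matrix is its adjugate."""
    a, b, c, d = m
    return (d % P, -b % P, -c % P, a % P)

# All matrices over F_3 with determinant 1.
G = [m for m in product(range(P), repeat=4) if (m[0]*m[3] - m[1]*m[2]) % P == 1]

def conj_class(x):
    return frozenset(mul(mul(g, x), inv(g)) for g in G)

class_sizes = sorted(len(c) for c in {conj_class(x) for x in G})
print(len(G), class_sizes)
```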


Example 177
Let $G = S_5$. This group has order $5! = 120$.

To understand conjugation in the symmetric group, let's think about different ways to label indices. For example, let's say we have a permutation $p = (123)(45)$ that we want to write in terms of the indices $a, b, c, d, e$. Then we can represent this as having a function $f$ that sends $1 \mapsto a$, $2 \mapsto b$, and so on. Then our relabeled permutation is $\tilde{p} = fpf^{-1}$. This is a slightly odd way to think about conjugation: $f$ is just a way to translate between languages. For instance,

$\tilde{p}(b) = [fpf^{-1}](b) = [fp](2) = f(3) = c$.

But now, what if $f$ doesn't send $1, 2, 3, 4, 5$ to $a, b, c, d, e$, and instead sends them to another permutation of the numbers? For example, what if $f$ sends them to $5, 3, 2, 1, 4$ respectively? Then $f$ is a permutation as well, and $\tilde{p} = fpf^{-1}$ is the permutation $(532)(14)$ obtained by relabeling the entries of each cycle of $p$.
Corollary 178
The conjugacy class of a permutation $p$ is the set of all permutations that have the same cycle type.

For example, $p$ in the example above is conjugate with all permutations with a 3-cycle and a 2-cycle. But we have to be a bit careful. The permutation $(123)(45)$ is conjugate with both $(532)(14)$ and $(325)(14)$, which correspond to different $f$s. However, those are the same permutation represented in different ways, so we don't have a one-to-one correspondence between choices of $f$ and conjugates.


Example 179
Let's first find the class equation for $S_4$.

The different cycle types are the identity, 2-cycles, pairs of 2-cycles, 3-cycles, and 4-cycles. Well, there's 1 identity, $\binom{4}{2} = 6$ 2-cycles, 3 pairs of 2-cycles, $\binom{4}{3} \cdot 2 = 8$ 3-cycles, and $3! = 6$ 4-cycles, and we easily have our class equation

$24 = 1 + 6 + 3 + 8 + 6.$

(The order in which we write the class equation doesn't matter.)


Example 180
Now we do $S_5$. (By the way, $S_6$ is for homework.)

We have the identity, 2-cycles, pairs of 2-cycles, 3-cycles, a 3-cycle together with a 2-cycle, 4-cycles, and 5-cycles. Call these classes $C_1$ through $C_7$. $|C_1| = 1$; this is just the identity. There are $|C_2| = \binom{5}{2} = 10$ 2-cycles. We can get pairs of 2-cycles in $|C_3| = \frac{1}{2}\binom{5}{2}\binom{3}{2} = 15$ ways. We can just keep going at this point: $|C_4| = \binom{5}{3} \cdot 2 = 20$ (pick three elements and then pick the order of the cycle), $|C_5| = 20$ as well, since we just put the remaining two indices in a 2-cycle, $|C_6| = \binom{5}{4} \cdot 3! = 30$, and $|C_7| = \frac{5!}{5} = 24$. This yields

$120 = 1 + 10 + 15 + 20 + 20 + 30 + 24.$
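Since conjugacy classes in $S_n$ are exactly the cycle types, these counts can be reproduced by listing every permutation and tallying cycle types. The following is a sketch with illustrative function names, using permutations of `0..n-1` stored as tuples.

```python
from collections import Counter
from itertools import permutations

def cycle_type(perm):
    """Cycle lengths (largest first) of a permutation of 0..n-1 given as a tuple."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:      # walk around the cycle containing `start`
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def class_sizes(n):
    """Map each cycle type in S_n to the number of permutations with that type."""
    return Counter(cycle_type(p) for p in permutations(range(n)))

print(class_sizes(5))
```

Calling `class_sizes(6)` gives a way to check the homework answer for $S_6$ after working it out by hand.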


Example 181
Finally, let's find the class equation for $A_4$.

Notice that each cycle type is either even or odd. In $S_4$, the identity, 3-cycles, and pairs of transpositions are even permutations, so $12 = 1 + 3 + 8$. However, just because permutations are conjugates in the symmetric group doesn't mean they are conjugates in the alternating group! In fact, 8 doesn't even divide 12. It turns out that the class equation for $A_4$ is

$12 = 1 + 3 + 4 + 4.$

How would we find something like this in general? We know that $S_n \supset A_n$. If $p$ is an even permutation (so $p \in A_n$), then it has a conjugacy class in both groups: let it be $C_S$ in $S_n$ and $C_A$ in $A_n$. But if we can conjugate by something in $A_n$, we can do the same in $S_n$, so $C_S \supseteq C_A$. In other words, $|C_A| \le |C_S|$.

Similarly, if we have centralizers $Z_A$ and $Z_S$ respectively, then an element that commutes with $p$ in $A_n$ also commutes with it in $S_n$. So $Z_A \subseteq Z_S$, and since centralizers are groups, we actually have that $|Z_A|$ divides $|Z_S|$. This is actually powerful: we know that $|C_S||Z_S| = |S_n| = n!$ and $|C_A||Z_A| = |A_n| = \frac{n!}{2}$. We only have a factor of 2 to work with, and $C_A, Z_A$ are subsets of $C_S, Z_S$ respectively.
Fact 182
For any even permutation $p$ in $S_n$, there are only two possibilities: either $|C_A| = |C_S|$ or $|C_A| = \frac{1}{2}|C_S|$. These correspond to a centralizer of order $|Z_A| = \frac{1}{2}|Z_S|$ and $|Z_A| = |Z_S|$, respectively.

The case where $|C_A| = \frac{1}{2}|C_S|$ occurs when every $g \in S_n$ such that $gp = pg$ is even. On the other hand, $|C_A| = |C_S|$ occurs if there is an odd permutation in $Z_S$, the centralizer of $p$ in $S_n$.

So now, we return to $S_5$. The alternating group $A_5$ contains the identity, pairs of 2-cycles, 3-cycles, and 5-cycles. There are 15 pairs of 2-cycles, and 15 is odd, so this class cannot split in half and we must have $|C_A| = 15$. (Alternatively, $(12)$, an odd permutation, commutes with $(12)(34)$.) For the 3-cycles, $(45)$ commutes with $(123)$, so this stays at 20. Finally, there are 24 5-cycles, and 24 does not divide 60, so this must split in half. This gives a class equation for $A_5$:

$60 = 1 + 15 + 20 + 12 + 12.$
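As a sanity check on how classes of $S_n$ split inside $A_n$, the sketch below computes conjugacy classes of $A_4$ and $A_5$ directly, conjugating only by even permutations. It is an illustration with made-up helper names, not the method used in lecture.

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]] for permutations stored as tuples."""
    return tuple(p[i] for i in q)

def inverse(p):
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def alternating_class_sizes(n):
    """Sorted conjugacy class sizes of A_n, conjugating only by even permutations."""
    A = [p for p in permutations(range(n)) if is_even(p)]
    classes = {frozenset(compose(compose(g, x), inverse(g)) for g in A) for x in A}
    return sorted(len(c) for c in classes)

print(alternating_class_sizes(4))
print(alternating_class_sizes(5))
```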


20 October 26, 2018 


Today we will be talking about the icosahedral group $I$, the set of rotational symmetries of an icosahedron or dodecahedron (which are dual).


Fact 183 


The dodecahedron has 12 faces, 20 vertices, and 30 edges. All of the faces are regular pentagons. 


We're going to return to the idea of a spin: given a rotation by $\theta$ around some axis, we can draw a unit vector $p$ along the axis. That ordered pair $(\theta, p)$ is called a spin. By the way, recall that

$\rho_{\theta, p} = \rho_{-\theta, -p}$.

Basically, a rotation that rotates a face by $+\theta$ also rotates the opposite face by $-\theta$. Call faces $f$, vertices $v$, edges $e$.

Let's try to count the elements of our group. As always, we have 1 identity, and every other symmetry can be described in one of several ways.

Each face has 4 nontrivial face rotations, and there are 12 faces, but each rotation is counted twice, so we have $\frac{4 \cdot 12}{2} = 24$ face rotations. Each vertex is connected to 3 faces, so it has 2 nontrivial rotations, and here we have $\frac{2 \cdot 20}{2} = 20$ vertex rotations. Finally, we have edge rotations: each edge can be rotated by $\pi$, so we have $\frac{1 \cdot 30}{2} = 15$ edge rotations. The total order is

$1 + 24 + 20 + 15 = 60.$

We can also look at orbits and stabilizers to count the order of $I$. All the faces form an orbit, since we can rotate around vertices. There are 12 faces and 5 elements that fix each face. Similarly, there are 20 vertices and 3 elements that fix each vertex, and 30 edges and 2 elements that fix each edge, so

$60 = 12 \cdot 5 = 20 \cdot 3 = 30 \cdot 2.$


Now let's compute the class equation of $I$, and to do that, let's try to visualize conjugation. Recall that if we want to rotate around $q$ by an angle $\theta$,

$\rho_{\theta, q} = g\rho_{\theta, p}g^{-1}$,

where $g$ is a rotation that sends $p$ to $q$. In other words, rotations by the same angle about axes that the group can carry to one another are in the same conjugacy class!

So rotations by $\frac{2\pi}{5}$ are conjugate to each other, and also to those rotations by $-\frac{2\pi}{5}$. In total, this gives 12 rotations by $\frac{2\pi}{5}$ (avoiding double counting). There's another conjugacy class of 12 rotations by $\frac{4\pi}{5}$ and a conjugacy class of 20 rotations about vertices. Finally, there are 15 rotations by $\pi$ around edges, for a class equation of

$60 = 1 + 12 + 12 + 20 + 15.$


What's the use of a class equation? We can use them to analyze normal subgroups! Recall that a subgroup $N$ of a group $G$ is normal if for all $x \in N$, $g \in G$, we have $gxg^{-1} \in N$.

Then $N$ is a union of conjugacy classes, including the class of 1 (the identity). Furthermore, $|N|$ divides $|G|$. This is pretty powerful: let's say we want a normal subgroup of $I$, the icosahedral group. We need to add some of 12, 12, 20, 15 to 1 to get a divisor of 60: the only way to do this is to get the trivial subgroup or the whole group!
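The divisibility argument is a finite search, so it is easy to automate. The sketch below (function name and interface are illustrative) tries every subset of the non-identity class sizes and keeps the totals that divide the group order. These totals are only candidates for orders of normal subgroups; the search by itself does not prove that a subgroup of that order exists.

```python
from itertools import combinations

def possible_normal_orders(order, class_sizes):
    """Orders 1 + (sum of some class sizes) that divide the group order.

    class_sizes lists the sizes of the non-identity conjugacy classes.
    """
    found = set()
    for k in range(len(class_sizes) + 1):
        for subset in combinations(class_sizes, k):
            total = 1 + sum(subset)
            if order % total == 0:
                found.add(total)
    return sorted(found)

print(possible_normal_orders(60, [12, 12, 20, 15]))  # icosahedral group
print(possible_normal_orders(24, [6, 3, 8, 6]))      # S_4
```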


Corollary 184
The icosahedral group $I$ has no proper normal subgroups except for the trivial subgroup (only containing the identity).

This means $I$ is a simple group, meaning there are no nontrivial proper normal subgroups. For example, groups of prime order are simple, but those are not interesting.

Recall the class equation for $S_4$:

$24 = 1 + 6 + 3 + 8 + 6.$

The normal subgroups can potentially be formed from $1$, $1 + 3$, $1 + 3 + 8$, and $1 + 6 + 3 + 8 + 6$. These correspond to the trivial subgroup, the identity together with the pairs of transpositions, the alternating group, and the whole group. In this case, these all happen to be normal subgroups.


Theorem 185
The icosahedral group $I$ is isomorphic to $A_5$.

Proof. To prove this, we find a way to have $I$ operate on 5 things. It turns out there are 5 cubes that are inscribed in the dodecahedron! Then this gives us a homomorphism $\varphi: I \to S_5$, because each symmetry permutes the 5 cubes. The kernel is a normal subgroup of $I$, and there are no nontrivial proper normal subgroups, so it must be trivial or the whole group. If the kernel were the whole group, then every symmetry would fix each of the five cubes, which is not true. So $\varphi$ is injective, and $I$ maps isomorphically to some subgroup $H$ of $S_5$.

Now consider the sign map that sends $S_5 \to \{\pm 1\}$. Where does $H$ go? The image is either $\{1\}$ or $\{\pm 1\}$. If $H$ maps onto $\{\pm 1\}$, then the kernel of the sign map restricted to $H$ has order $\frac{1}{2}|H| = 30$. But this means there is a corresponding normal subgroup of order 30 in $I$ (by the correspondence theorem), which is a contradiction. Thus, this sign map must only map $H$ to $\{1\}$, so $H$ is in the kernel of the sign map, which is $A_5$. Indeed, both have order 60, so $H = A_5$ and we've shown our isomorphism.


Corollary 186
The alternating group $A_5$ is simple.

Theorem 187
$A_n$ is simple for all $n \ge 5$.

Notice that $A_4$ is not simple: it has a normal subgroup of order 4.

Outline of a proof. Let $N$ be a nontrivial normal subgroup of $A_n$: we must show that $N = A_n$. If $N$ is normal in $G$, then it is closed under multiplication, inversion, and conjugation. In particular, for $x \in N$ and $g \in G$, the commutator $gxg^{-1}x^{-1}$ is in $N$. The idea is that if the group is noncommutative enough, this gives us a lot of elements in $N$.

First, show that the 3-cycles generate $A_n$. Then, we show that for $n \ge 5$, the 3-cycles form one conjugacy class. Now, to show $N = A_n$, we can take $x \in N$ and $g \in A_n$ and find a 3-cycle $gxg^{-1}x^{-1} \in N$. Then if one 3-cycle is in $N$, then all of them are, since the 3-cycles are in one conjugacy class. Then those generate the whole group.

Let's do an example: suppose that $N$ contains the element $x = (12345)\cdots$. Letting $g = (123)$,

$gxg^{-1}x^{-1} = (123)[(12345)\cdots][(132)][\cdots(15432)] = (124)(3)(5)(\cdots)$,

and the $\cdots$ cancel and we can ignore them; we've found a 3-cycle in $N$.
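The commutator in this example can be verified mechanically. The sketch below assumes the convention that a product of permutations is read right to left (the rightmost factor acts first) and ignores the extra cycles written as $\cdots$, which cancel. The helpers `from_cycles` and `compose` are illustrative names.

```python
def from_cycles(cycles, n=5):
    """Build a permutation of 1..n (as a dict) from a list of cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm

def compose(*perms):
    """Compose right to left: the rightmost permutation acts first."""
    result = {i: i for i in perms[0]}
    for p in reversed(perms):
        result = {i: p[result[i]] for i in result}
    return result

g = from_cycles([(1, 2, 3)])
x = from_cycles([(1, 2, 3, 4, 5)])
g_inv = from_cycles([(1, 3, 2)])
x_inv = from_cycles([(1, 5, 4, 3, 2)])

commutator = compose(g, x, g_inv, x_inv)
print(commutator)  # a dict i -> image of i
```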


21 October 29, 2018 


Today, we're going to talk about the three Sylow theorems, which are probably the first important general structure 
theorems (except for maybe some results that Galois proved). We'll state them and do some applications today, and 


we'll do the proofs and some other things on Wednesday. 


Theorem 188 (Sylow Theorem 1)
Let $G$ be a finite group of order $n$, and let $p$ be a prime dividing $n$ such that $n = p^e m$ and $p \nmid m$. (In other words, $p^e$ is the largest power of $p$ dividing $n$.) Then there always exists a Sylow $p$-subgroup, which is a subgroup of $G$ that has order $p^e$.


As a corollary, there is always an element of order $p$: take a nontrivial element of a Sylow $p$-group and look at the cyclic group generated by it. That element has order $p^k$ for some $k \ge 1$, so a suitable power of it has order exactly $p$.


Let's see what we can do with this. 


Example 189 


Let |G| = 6. What can we say about the group from the first Sylow theorem? 


Solution. By above, there exists an element $x$ of order 3, and there also exists an element $y$ of order 2. So we have the six elements

$G = \{1, x, x^2, y, xy, x^2y\}.$

We can see that $xy$ is not equal to $1, x, x^2, y$ by cancellation, and similarly that $x^2y$ is also distinct from the others. Thus all six of the elements in this set are distinct, so this is actually our group $G$.

Let $K$ be the Sylow 3-subgroup generated by $x$. It has index 2 in $G$, since $G = K \cup yK$.


Fact 190
Any subgroup of index 2 is normal, so $K$ is normal.

(This is because $G = K \cup yK = K \cup Ky$. Therefore $yK = Ky$, and all left and right cosets are equal.) As a result, $yxy^{-1}$ must be an element of $K$. It's not 1, or we would get that $x = 1$, so either $yxy^{-1} = x$ or $yxy^{-1} = x^2$. These two possibilities must then correspond to the two non-isomorphic groups of order 6 that we know: the first case gives us the group law for $C_6$, and the second case gives us the group law for $S_3 \cong D_3$. Every group of order 6 is therefore isomorphic to either $C_6$ or $S_3$.

Notice that this isn't the only way to represent $C_6$. We actually did it as a product group $C_3 \times C_2$ here:


Definition 191
Given two groups $H, K$, the product group $H \times K$ has as elements the product set of pairs $(h, k)$, $h \in H$, $k \in K$, with multiplication componentwise:

$(h, k) \cdot (h', k') = (hh', kk').$

Lemma 192 (Chinese Remainder Theorem)
For any $n = ab$, where $\gcd(a, b) = 1$, $C_n$ is isomorphic to $C_a \times C_b$.

Proof. Since $a$ and $b$ are relatively prime, there exist $r, s \in \mathbb{Z}$ such that $ra + sb = 1$. Let's say $x \in C_n$ generates the group: that is, it has order $n$. Let $u = x^a$ and let $v = x^b$: then $u$ has order $b$ and $v$ has order $a$.

Let $y$ generate $C_a$ and $z$ generate $C_b$. Now consider the map

$C_a \times C_b \to C_n: (y, 1) \mapsto v, \ (1, z) \mapsto u.$

This sends $(y^i, z^j) \mapsto v^i u^j = x^{bi + aj}$, and it is a homomorphism because $C_n$ is abelian. Picking $i = s$ and $j = r$ makes $bi + aj = 1$, so we can get $x$ out of this map, and therefore we can generate all of $C_n$. Therefore the map is surjective, and since $C_n$ and $C_a \times C_b$ have the same finite order, the map must be injective as well (and we have an isomorphism).
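Written additively, the map in this proof sends $(i, j)$ to $bi + aj \bmod n$, which makes it easy to test on small cases. The sketch below is an illustration with assumed function names: it checks bijectivity for a coprime pair and shows that it fails for a non-coprime one.

```python
from math import gcd

def crt_map(a, b):
    """Images of (i, j) in Z_a x Z_b under (i, j) -> b*i + a*j mod ab.

    Written additively: x^(b*i + a*j) corresponds to b*i + a*j mod n.
    """
    n = a * b
    return {(i, j): (b * i + a * j) % n for i in range(a) for j in range(b)}

def is_isomorphism(a, b):
    """The map is additive by construction, so it suffices to check bijectivity."""
    images = crt_map(a, b).values()
    return len(set(images)) == a * b

print(is_isomorphism(3, 5), gcd(3, 5))
print(is_isomorphism(2, 4), gcd(2, 4))
```

For $a = 3$, $b = 5$ one choice is $r = s = 2$, since $2 \cdot 3 + 2 \cdot 5 = 16 \equiv 1 \pmod{15}$, so the pair $(2, 2)$ maps to the generator.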


Theorem 193 (Sylow Theorem 2)
All Sylow $p$-subgroups are conjugate: if $H, H'$ are two Sylow $p$-subgroups, then $H' = gHg^{-1}$ for some $g \in G$.

Theorem 194 (Sylow Theorem 3)
The number of Sylow $p$-subgroups divides $m = \frac{n}{p^e}$ and is congruent to 1 mod $p$.


This last result is really useful for looking at groups of some fixed order. 


Fact 195 


If there is only one Sylow p-subgroup H, then it is normal. 


This is because every conjugate $gHg^{-1}$ is again a Sylow $p$-subgroup, so $H$ is equal to all of its conjugates and the left and right cosets are indeed equal.


Example 196 


What can we say about the groups of order $10 = 2 \cdot 5$?


Solution. The number of Sylow 5-groups divides 2 and is 1 mod 5, so it is 1. Thus $K$, the subgroup of order 5, is normal, and it is generated by some $x$ with order 5.

By the way, we know that the number of Sylow 2-groups divides 5 and is 1 mod 2, so it is either 1 or 5, but we won't use this. Either way, there exists an element $y$ of order 2 that generates one of the Sylow 2-groups. So we can write the elements out as

$G = \{1, x, x^2, x^3, x^4, y, xy, x^2y, x^3y, x^4y\}.$

Since $K = \langle x \rangle$ is normal, $yxy^{-1} = x^r \implies yx = x^ry$ for some $r$ (since we must stay within the normal subgroup $K$ after conjugating by $y$). We also know that $x^5 = y^2 = 1$.

Does such a group exist for all $r$? Technically yes, but if we put bad relations on a group, it might collapse. For example, let's try $yx = x^2y$. After some manipulation,

$x = y^2x = yx^2y = x^4y^2 = x^4 \implies 1 = x^3,$

and together with $x^5 = 1$ this gives $x = 1$, which means we don't have a group of order 10 at all. Moral of the story: relations are always fine, but we might collapse the group. Well, let's try this calculation with other $r$:

$x = y^2x = yx^ry = x^{r^2}y^2 = x^{r^2},$

so if we want $x$ to be nontrivial, $r^2 \equiv 1 \pmod 5$, which implies $r = 1, 4$. Thus, there are two isomorphism classes of groups of order 10. And we know what they are: $yx = xy$ corresponds to $C_{10} \cong C_5 \times C_2$ and $yx = x^4y$ corresponds to $D_5$.
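The condition on $r$ is a small modular computation. In the sketch below (function names are illustrative), `valid_exponents` lists the $r$ with $r^2 \equiv 1 \pmod n$, and `max_order_of_x` uses the fact that $x^n = 1$ together with $x^{r^2 - 1} = 1$ forces the order of $x$ to divide $\gcd(n, r^2 - 1)$, which shows the collapse for bad choices of $r$.

```python
from math import gcd

def valid_exponents(n):
    """All r in 1..n-1 with r^2 = 1 mod n, i.e. consistent with y^2 = 1."""
    return [r for r in range(1, n) if (r * r) % n == 1]

def max_order_of_x(n, r):
    """x^n = 1 and x^(r^2 - 1) = 1 force the order of x to divide this gcd."""
    return gcd(n, r * r - 1)

print(valid_exponents(5))
print([max_order_of_x(5, r) for r in range(1, 5)])
```

A value of 1 from `max_order_of_x` means $x$ is forced to be the identity, as happened with $yx = x^2y$.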


Example 197 


What are the groups of order 15? 


Proof. The number of Sylow 5-groups divides 3 and is congruent to 1 mod 5, so it is again 1. The number of Sylow 3-groups divides 5 and is congruent to 1 mod 3, so this is also 1. So both subgroups are normal; call them $K$ and $H$ respectively.

What's the intersection of $H$ and $K$? $H \cap K$ is a subgroup of both $H$ and $K$, so the size of the intersection must divide both 3 and 5, so it must be 1.

We claim that $G$ is isomorphic to $C_3 \times C_5 \cong C_{15}$. In other words, there is only one isomorphism class of groups of order 15. To justify this, we need the following result:


Lemma 198
Let $H$ and $K$ be subgroups of $G$.

• If $K$ is normal, then $HK = KH$ is a subgroup of $G$.

• If both are normal and $H \cap K = \{1\}$, the elements of $H$ commute with the elements of $K$, so $hk = kh$.

• Finally, if both are normal, $H \cap K = \{1\}$, and $HK = G$, then $G$ is isomorphic to $H \times K$.

Proof. Let's do these one by one. If $K$ is normal, then $hk = (hkh^{-1})h \in KH$. So $HK \subseteq KH$, and we can analogously show that $KH \subseteq HK$, so $HK = KH$. This is a subgroup of $G$ because (checking closure) $HKHK = H(KH)K = H(HK)K = HHKK = HK$.

For the second one, we show that $hkh^{-1}k^{-1} = 1$. It's equal to $(hkh^{-1})k^{-1} \in K$, and it's equal to $h(kh^{-1}k^{-1}) \in H$. Therefore, the element $hkh^{-1}k^{-1}$ is in the intersection $H \cap K$, so this commutator must be the identity, and therefore $hk = kh$.

Finally, we know that $H$ and $K$ still commute in the third case, so the map $H \times K \to G$ which sends $(h, k) \mapsto hk$ is a homomorphism. The kernel of this map is trivial, since $hk = 1 \implies h = k^{-1} \in H \cap K \implies h = k = 1$ (we can't have them be non-identity inverses if $H$ and $K$ have only trivial intersection). The image is $HK = G$, so this is an isomorphism and $G$ is isomorphic to $H \times K$.

And using this with our groups $C_3$ and $C_5$ (here $|HK| = |H||K| = 15$, so $HK = G$), we have now shown that $G \cong C_5 \times C_3 \cong C_{15}$ is the only group of order 15.


22 October 31, 2018 


We're going to prove the Sylow theorems today. Here they are together: 


Theorem 199 (All three Sylow Theorems)
Let $G$ be a finite group with order $n = p^e m$ for a prime $p$ (where $p \nmid m$).

• There exists a subgroup of order $p^e$, called a Sylow $p$-subgroup.

• All Sylow $p$-subgroups are conjugates.

• The number of Sylow $p$-subgroups is 1 mod $p$, and it always divides $m$.

Proof of the first Sylow theorem. Let $S$ be the set of all subsets of $G$ of order $p^e$. The size of $S$ is

$|S| = \binom{n}{p^e} = \frac{n!}{p^e!\,(n - p^e)!} = \frac{n(n-1)\cdots(n - p^e + 1)}{p^e(p^e - 1)\cdots 1}.$

This is not divisible by $p$: we'll accept this for now, but we can verify by checking that the powers of $p$ match up in the numerator and denominator.
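The claim that $\binom{n}{p^e}$ is not divisible by $p$ can at least be spot-checked numerically. The sketch below is a numerical check rather than a proof, and the function name is illustrative; it uses `math.comb` from the Python standard library (available from Python 3.8).

```python
from math import comb

def subsets_count_mod_p(p, e, m):
    """Return C(p^e * m, p^e) mod p, the number of p^e-element subsets mod p."""
    n = p**e * m
    return comb(n, p**e) % p

# Try a few primes, exponents, and cofactors m not divisible by p.
for p, e, m in [(2, 2, 3), (3, 1, 4), (5, 1, 6), (2, 3, 5)]:
    print(p, e, m, subsets_count_mod_p(p, e, m))
```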

Now $G$ operates on $S$ by sending any $U \in S$ to $g * U = gU$. Since $|S|$ is not divisible by $p$ and is partitioned into orbits, some orbit has order not divisible by $p$: say that it is $\mathrm{Orbit}(U)$. We claim that this orbit covers $G$ evenly. More specifically, say that the element 1 is in all of $U_1, \dots, U_k$, where the $U_i \in S$ are all in the orbit of $U$. Then $g \in gU_1, \dots, gU_k$, which are all also in the orbit, so 1 and $g$ are in the same number of sets of the orbit of $U$.

Thus, every element is in the same number of sets of the orbit: let's say the group $G$ is covered $k$ times. So

$|U||\mathrm{Orbit}(U)| = k|G| \implies p^e|\mathrm{Orbit}(U)| = kp^e m,$

so the order of the orbit of $U$ is $km$. But the order of the orbit must divide $|G| = p^e m$, meaning $k$ must be a power of $p$. On the other hand, we chose the orbit's order to not be divisible by $p$, so we must have $k = 1$. So the orbit actually only covers $G$ once, and the orbit of $U$ has order $m$. Then the orbit-stabilizer counting formula tells us the stabilizer of $U$ has order $p^e$.


But the stabilizer is a group! So we have indeed found a Sylow p-subgroup, as desired. 


We'll take a break now to study some more structure of Sylow p-subgroups: 


Example 200
We found last time that groups of order 10 are either $C_{10}$ or $D_5$, and groups of order 15 must be isomorphic to $C_{15}$. What about groups of order 30?

Solution. The number of Sylow 5-subgroups divides 6 and is 1 mod 5, so it is either 1 or 6. The number of Sylow 3-subgroups divides 10 and is 1 mod 3, so it must be 1 or 10. If there are 6 Sylow 5-subgroups, they are all cyclic and intersect only in the identity, so each contributes 4 elements of order 5, and we have 24 elements of order 5. On the other hand, if there are 10 Sylow 3-subgroups, we have 20 elements of order 3. Both cannot happen at once, since $24 + 20 > 30$. So we either have a normal Sylow 5-subgroup $H$ or a normal Sylow 3-subgroup $K$.


Lemma 201
If $H$ and $K$ are subgroups, and at least one is normal, then $HK$ is a group.

(This is just verification of closure and the other axioms.) Taking $H$ and $K$ to be a Sylow 5-subgroup and a Sylow 3-subgroup, $H$ has order 5 and $K$ has order 3, and they (again) must have trivial intersection, so $HK$ has order 15. Thus, $G$ contains a subgroup isomorphic to $C_{15}$; let $x$ be one of its elements with order 15.

We know there exists a Sylow 2-subgroup, so there is an element of order 2; let it be $y$. Then $\langle x \rangle$ has order 15 and index 2 in $G$, and index 2 subgroups are normal. So $yxy^{-1}$ is still in $\langle x \rangle$, which means $yx = x^ry$ for some $1 \le r < 15$, and also

$x = y^2xy^{-2} = yx^ry^{-1} = x^{r^2},$

so $r^2 \equiv 1 \pmod{15}$, which gives $r = \pm 1, \pm 4$.

Each of these corresponds to a different group $\langle x, y \mid x^{15} = y^2 = 1,\ yx = x^ry \rangle$, and it is unique because we can write out a multiplication table just based on these properties of $x$ and $y$. We just need to check that these actually give a group of order 30 without collapsing, and it turns out there are indeed 4 groups of order 30: $C_{30}$, $D_{15}$, $C_3 \times D_5$, and $C_5 \times D_3$. The last two groups have centers $C_3$ and $C_5$ respectively, so they are distinct. So we've found all groups of order 30!


Proof of the second Sylow theorem. Let $H$ be a Sylow $p$-subgroup, and let $C$ be the set of left cosets of $H$. The group $G$ operates on the cosets via $g * aH = gaH$. There is only one orbit (the orbit of $1 \cdot H$), and the stabilizer of $1 \cdot H$ has order $|H| = p^e$, so by the orbit-stabilizer theorem, the orbit of $1 \cdot H$ has order $m$, which is not divisible by $p$.

Now let $H'$ be another Sylow $p$-subgroup; we wish to show it is conjugate to $H$. Restrict the operation of $G$ on $C$ to $H'$, so we can only multiply cosets by elements of $H'$.

Decompose $C$ into $H'$-orbits. Recall that $p$ does not divide $|C| = m$.


Lemma 202 (Fixed Point Theorem)
Let $H$ be a $p$-group that operates on a set $S$. If $p$ does not divide $|S|$, there exists a fixed point $s$ such that $hs = s$ for all $h \in H$.

Proof. Partition $S$ into orbits. Since $|H|$ is a power of $p$ and the orbits' orders divide $|H|$, $|S|$ is a sum of powers of $p$. But $p$ does not divide $|S|$, so we must have a 1 term in there somewhere: thus there is an element with orbit of size 1, making it a fixed point.

Thus, the operation of $H'$ on cosets of $H$ has a fixed point; let's say $gH$ is fixed by all elements of $H'$. So $H'$ is contained in the stabilizer of $gH$, which is $gHg^{-1}$. This means that $H' \subseteq gHg^{-1}$, but both have order $p^e$. Thus $H' = gHg^{-1}$, and the two subgroups are indeed conjugate.


Remark 203. We didn't get to the proof of the third Sylow theorem, but the main ideas are as follows: since all Sylow $p$-subgroups are conjugate, they form a single orbit $\{gHg^{-1}\}$ under conjugation by $G$, and the number of them is $[G : K]$, where $K$ is the stabilizer of $H$ (the set of $g$ with $gHg^{-1} = H$). Now $H$ is a subgroup of $K$, so $[G : H] = m$ is a multiple of $[G : K]$.

To show that the number of Sylow $p$-subgroups is 1 mod $p$, we decompose the set of them into orbits under conjugation by $H$. Then $H$ is its own orbit, but all other orbits have order a nontrivial power of $p$ dividing $|H| = p^e$, so the total sum is 1 mod $p$ as desired.
Recall that $S_3$, the symmetric group on 3 elements, can be represented in the form

$\langle x, y \mid x^3 = 1,\ y^2 = 1,\ yxyx = 1 \rangle.$

Things like $x^3$, $y^2$, $yxyx$ are called relations. Today's class is about how we can use the Todd-Coxeter algorithm to see how the group operates on cosets of a subgroup!

First, we choose a subgroup $H$ of $G$ and write it by using words in $x, y$ that are generators for $H$. For example, we can take $H = \langle y \rangle$. This has order 2, so there will be 3 cosets.

Theorem 204 (Todd-Coxeter Algorithm)
Here are some rules for operating on cosets of $H$: this generates a unique correct table.

• The relations operate trivially, since they're equal to 1.

• The operation of a generator (in this case, $x$ or $y$) is a permutation of the cosets.

• The generators of $H$ fix the coset $H$.

We're going to work with right cosets so that we don't have to read backwards.


We can start off by writing the relations out like this:

    x   x   x   |   y   y   |   y   x   y   x

Cosets aren't uniquely represented by elements, so we'll denote them by indices instead. In this example, we'll let "1" denote the coset $H \cdot 1$.

We don't know what $x$ does to $H$, but we know $x^3$ sends it to 1, and so do the other relations:

    x   x   x   |   y   y   |   y   x   y   x
  1   .   .   1 | 1   .   1 | 1   .   .   .   1

But we also know that $y$ generates $H$, so it sends 1 to 1 as well:

    x   x   x   |   y   y   |   y   x   y   x
  1   .   .   1 | 1   1   1 | 1   1   .   .   1

We don't know what $x$ does to 1, so we'll say 1 goes to 2, and then we'll say 2 goes to 3, so we know 3 goes to 1:

    x   x   x   |   y   y   |   y   x   y   x
  1   2   3   1 | 1   1   1 | 1   1   2   3   1

(Reading the last relation, $y$ must send 2 to 3.) Well, now let's try to figure out what happens to the cosets represented by 2 and 3: we can already fill out the remainder of the table just given what we know.

    x   x   x   |   y   y   |   y   x   y   x
  1   2   3   1 | 1   1   1 | 1   1   2   3   1
  2   3   1   2 | 2   3   2 | 2   3   1   1   2
  3   1   2   3 | 3   2   3 | 3   2   3   2   3

And now we notice $\langle y \rangle$ has index 3 (because there are 3 cosets which are only sent to each other under $x$ and $y$). Now the order of $x$ is 3 (since it operates nontrivially on the cosets) and the order of $y$ is 2 (it operates nontrivially on 3), so $|G| = [G : H]\,|H| = 3 \cdot 2 = 6$, as we probably already knew.

And we can write out the operations on the cosets: $x$ acts as $(123)$ and $y$ acts as $(23)$. And this is exactly the standard representation for $S_3$ that we're used to!


Example 205
Let's apply the Todd-Coxeter algorithm to the tetrahedral group.

In the tetrahedral group, we can rotate around a face by $\frac{2\pi}{3}$, about a vertex by $\frac{2\pi}{3}$, or around an edge by $\pi$. Call these $x, y, z$ respectively. If we pick the vertex to be one of the vertices of the face, and the edge to be the counterclockwise edge from that vertex along that face, we have the following relations:

$x^3 = y^3 = z^2 = xyz = 1.$

(By the way, this tells us $xy = z^{-1}$.)

Are these relations enough to determine the group? Let's use Todd-Coxeter! We'll make the calculation easier by using generators $x, y$ and having relations $x^3$, $y^3$, $xyxy$ (since $z^{-1} = xy$ and $z^2 = 1$).

We'll use the subgroup $H = \langle x \rangle$, so 1 stands for the coset of $H$. In particular, this means $x$ does nothing to 1.

    x   x   x   |   y   y   y   |   x   y   x   y
  1   1   1   1 | 1   .   .   1 | 1   1   .   .   1

We don't know what $y$ does to 1, so we'll say that $y$ sends it to 2 and then to 3.

    x   x   x   |   y   y   y   |   x   y   x   y
  1   1   1   1 | 1   2   3   1 | 1   1   2   .   1

Now what happens to 2? We know $y$ sends 2 to 3 and then 3 to 1 from above:

    x   x   x   |   y   y   y   |   x   y   x   y
  1   1   1   1 | 1   2   3   1 | 1   1   2   3   1
  2   .   .   2 | 2   3   1   2 | 2   .   .   .   2

and reading off the top row, we also know that $x$ sends 2 to 3.

    x   x   x   |   y   y   y   |   x   y   x   y
  1   1   1   1 | 1   2   3   1 | 1   1   2   3   1
  2   3   4   2 | 2   3   1   2 | 2   3   1   1   2

(It's also possible right now that $x$ sends 3 to one of 1, 2, 3; we don't know. For now, though, let's put a 4 there. If we find 1 = 4 or something, that's fine, and we can correct it later.)

Now we can fill in the action for the cosets 3 and 4, since we know that $x$ cycles 2, 3, 4:

    x   x   x   |   y   y   y   |   x   y   x   y
  1   1   1   1 | 1   2   3   1 | 1   1   2   3   1
  2   3   4   2 | 2   3   1   2 | 2   3   1   1   2
  3   4   2   3 | 3   1   2   3 | 3   4   4   2   3
  4   2   3   4 | 4   4   4   4 | 4   2   3   4   4

So there are 4 cosets, so $\langle x \rangle$ has index 4. We know that $x^3 = 1$ and it operates as a 3-cycle, so $x$ has order 3, so the order of the group is $4 \cdot 3 = 12$. This is indeed the order of the tetrahedral group, so the representation above with $x$ and $y$ is a valid description.

Remark 206. Alternatively, we could have written out all possibilities for $x$ and $y$. But this isn't commutative, so it's not enough to write any element as $x^a y^b$.

So what do we know about what $x$ and $y$ do to the cosets? $x = (234)$ and $y = (123)$; these generate $A_4$, so the tetrahedral group is isomorphic to $A_4$.

As an interesting exercise, let's see what happens to $xyxy$. Remember that we're operating on right cosets, so $xyxy$ means to first apply $x$, then $y$, then $x$, then $y$: composing the permutations $(234), (123), (234), (123)$ in that order indeed yields the identity permutation, as we expect.
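The coset actions found by Todd-Coxeter can be double-checked by following each relation through every coset, reading letters left to right as in the right-coset convention. The sketch below is only a verifier for a finished table, not an implementation of the Todd-Coxeter algorithm itself; the names `follow` and `relations_hold` are illustrative.

```python
def follow(word, action, start):
    """Apply the letters of `word` left to right (right-coset convention)."""
    coset = start
    for letter in word:
        coset = action[letter][coset]
    return coset

def relations_hold(relations, action):
    """Every relation must send every coset back to itself."""
    cosets = next(iter(action.values())).keys()
    return all(follow(r, action, c) == c for r in relations for c in cosets)

# S_3 on cosets of <y>: x acts as (123), y as (23)
s3 = {"x": {1: 2, 2: 3, 3: 1}, "y": {1: 1, 2: 3, 3: 2}}
# Tetrahedral group on cosets of <x>: x acts as (234), y as (123)
tet = {"x": {1: 1, 2: 3, 3: 4, 4: 2}, "y": {1: 2, 2: 3, 3: 1, 4: 4}}

print(relations_hold(["xxx", "yy", "yxyx"], s3))
print(relations_hold(["xxx", "yyy", "xyxy"], tet))
```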


Time to use a "bad set of relations" to see what happens with Todd-Coxeter:

Example 207
Take $x^3$, $y^2$, $yxyxy$ as relations this time. What happens to the Todd-Coxeter algorithm?

Let's take $H$ to be the subgroup generated by $x$. Let's say $y$ sends 1 to 2, so $y$ sends 2 to 1.

    x   x   x   |   y   y   |   y   x   y   x   y
  1   1   1   1 | 1   2   1 | 1   2   .   .   2   1
  2   .   .   2 | 2   1   2 | 2   1   1   2   1   2

Reading the last relation in the second row, $x$ sends 2 to 1. But $x$ also sends 1 to 1, so 1 = 2. So $\langle x \rangle$ has index 1 (since we could have used this argument for any coset).

This means $\langle x \rangle = G$, so $y = x^r$ is in the group generated by $x$. Then $yxyxy = x^{3r+2} = 1$, and since $x^3 = 1$ this gives $x^2 = 1 \implies x = 1$, so the whole group is trivial.


This algorithm is a deterministic way to determine our group, as long as the order is finite! 
Remark 208. The only problem is the name: we should list names alphabetically in mathematics. 


Let’s turn our attention away from those relations for a second and speak more generally: 


Definition 209
A word is any string of letters in an alphabet, along with their inverses.

For example, $yxy^{-1}x^2$ is a word. The free group is basically just the set of words, but if we have $x$ and $x^{-1}$ next to each other, they should cancel.
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Definition 210 


A reduced word is a word with no possible cancellation.

There are two problems now. First, we could end up with the empty word, which we can call 1. Second, we can cancel adjacent terms in different orders: in particular, if we had something like $xyy^{-1}x^{-1}x$, the end result could either be the first $x$ or the last $x$. Luckily, the reduced word obtained by cancellation is unique, and it's not hard to prove this.


Definition 211 


The free group is the set of reduced words, where the law of composition is juxtaposition and then cancellation.


For instance, 


$$(xyx^{-1})(xy^{-1}x) = xyx^{-1}xy^{-1}x = x^2.$$
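
The "juxtapose, then cancel" rule can be sketched in a few lines. This is only an illustration under an assumed encoding: a word is a list of `(letter, sign)` pairs with sign `+1` or `-1`, and the names `reduce_word` and `multiply` are hypothetical. A stack performs the cancellation in one pass, which also shows concretely that the order of cancellation does not matter for the result.

```python
def reduce_word(word):
    """Cancel adjacent inverse pairs until the word is reduced."""
    stack = []
    for letter, sign in word:
        if stack and stack[-1] == (letter, -sign):
            stack.pop()            # x x^{-1} or x^{-1} x cancels
        else:
            stack.append((letter, sign))
    return stack


def multiply(u, v):
    """Law of composition in the free group: juxtapose, then cancel."""
    return reduce_word(u + v)


u = [("x", 1), ("y", 1), ("x", -1)]      # x y x^{-1}
v = [("x", 1), ("y", -1), ("x", 1)]      # x y^{-1} x
product = multiply(u, v)
print(product)
```

Here `product` is the word $x^2$, and reducing $xyy^{-1}x^{-1}x$ the same way leaves a single $x$.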


Any subgroup of a free group aside from the trivial group is infinite, but we can mod out by a normal subgroup using relations.

There is a hard problem called the word problem: given a bunch of generators and relations, we can ask whether a word is equal to 1. But it turns out that, in general, no computer program can decide this (the problem is undecidable)!
Recall that we can define a group $G$ by specifying a set of generators $x, y, \dots$ and relations $r_1, \dots, r_k$, which are words consisting of $x, x^{-1}, y, y^{-1}$, and so on.
We basically started with the free group F on generators: the elements are all reduced words using those generators 
(and their inverses), and then the group G is defined to be F/N, where N is the smallest normal subgroup that contains 


those relations. 


Fact 212 


If a relation $r = 1$ is true in $G$, then $grg^{-1} = 1$ should also be true in $G$.


So the normal subgroup $N$ is generated by all conjugates of relations:
$$\{grg^{-1} \mid g \in F,\ r \text{ a relation}\}.$$
This is unfortunately very hard to use, even though it’s easy to define. 


Example 213 


Let the generators be $x, y$, and let's say $xy = yx$. Then $xyx^{-1}y^{-1}$ is the only relation.


So $G$ is generated by $x$ and $y$, and they commute, so it's just going to be isomorphic to $\mathbb{Z}^2$, since all elements can be written as $x^a y^b$. $G$ is called the free abelian group on two generators.

Notice that $x^2y = yx^2$, so something like $x^2yx^{-2}y^{-1}$ is in the normal subgroup $N$ that we're modding out by. Specifically, we want the normal subgroup $N$ generated by $g(xyx^{-1}y^{-1})g^{-1}$, where $g$ is any element in the free group.


Well, it's pretty annoying to try to find $x^2yx^{-2}y^{-1}$ explicitly in $N$: one way is to do

$$x^2yx^{-2}y^{-1} = x(xyx^{-1}y^{-1})x^{-1} \cdot (xyx^{-1}y^{-1}),$$

but it's a mess and not very important.


Example 214 


Let's do another example of the Todd-Coxeter algorithm, this time with xyx = yxy. This is known as the braid 


relation (we'll add other relations in a second). 


The idea here is to imagine having three strands that we try to braid together. Then $x$ swaps the first and second (with the left one on top) and $y$ swaps the second and third. (For example, $x^2$ twists the braid twice.) The inverses $x^{-1}$ and $y^{-1}$ also exist (we just swap so that the right is on top instead).


Here’s a picture to explain why xyx = yxy: it can be checked that these two braids are equivalent. 


[Figure: two three-strand braid diagrams illustrating $xyx = yxy$.]


To make it a finite group, let's also say that $x^3 = y^2 = 1$. Then the braid relation becomes $xyxyx^{-1}y = 1$:

     x   x   x   |   y   y   |   x   y   x   y  x^-1  y
   1   _   _   1 | 1   _   1 | 1   _   _   _   _    _   1


Take H = (x). Recall the rules for operating on cosets: generators operate as permutations, relations must be the 
identity permutation (operate trivially), and generators for H also fix the coset H. We're working with right cosets! 


So this means applying x sends 1 to 1: 


     x   x   x   |   y   y   |   x   y   x   y  x^-1  y
   1   1   1   1 | 1   _   1 | 1   1   _   _   _    _   1


We don't know what y does to 1: let's say it sends 1 to 2. Then 2 must go back to 1. 


     x   x   x   |   y   y   |   x   y   x   y  x^-1  y
   1   1   1   1 | 1   2   1 | 1   1   2   _   _    2   1
   2   _   _   2 | 2   1   2 | 2   _   _   2   1    1   2


We don't know what happens to 2 under x, so let’s say x cycles 2, 3, 4. 


     x   x   x   |   y   y   |   x   y   x   y  x^-1  y
   1   1   1   1 | 1   2   1 | 1   1   2   3   4    2   1
   2   3   4   2 | 2   1   2 | 2   3   4   2   1    1   2


But now $x^{-1}$ sends 4 to 2, so $x$ sends 2 to 4. Thus 3 = 4, and that means 2 and 3 are both sent to 3 by $x$, so 2 = 3 as well. So now our relation just looks like


     x   x   x   |   y   y   |   x   y   x   y  x^-1  y
   1   1   1   1 | 1   2   1 | 1   1   2   2   2    2   1
   2   2   2   2 | 2   1   2 | 2   2   2   2   1    1   2


and now $y$ sends both 1 and 2 to 2, so 1 = 2. So there's only one coset, meaning that $\langle x \rangle = G$! But now a little more analysis collapses the group further: we see that $y$ must be a power of $x$, so its order divides 3, but $y$ also has order dividing 2, so $y = 1$. Then the braid relation $xyx = yxy$ becomes $x^2 = x$, so $x = 1$ and we have the trivial group.


Proposition 215 


If the Todd-Coxeter table is consistent, we have a correct operation on cosets. 


Proof. We will show that we have a bijective map $\varphi$ from $I$, the set of coset indices, to $C$, the set of cosets, which is compatible with the operations of the group. To do this, we first write out what the algorithm actually makes us do.

Let's say we're at some stage, and we have some indices $I^*$, as well as some operations of generators on $I^*$. When we fill in the table, one of two things always happens: we equate two indices if the rules tell us it is necessary, or we add a new index $j$ such that $ig = j$ for some generator $g$ and some existing index $i$.

We start with $I^* = \{1\}$, where the index 1 stands for the coset $H \cdot 1$, and we stop when we have a consistent table. So our map $\varphi$ will send 1 to $H \cdot 1$. Whenever we equate two indices, we're saying they act the same under all rules, so this is consistent. On the other hand, adding new indices is definitely fine: if $i \mapsto Ha$ in our map $\varphi$, then we just send $j \mapsto Hag$. So we will always have a map $I^* \to C$ which is compatible with the operations.

To finish, we need to show that this map is bijective. Every index is in the orbit of 1, and the group operates transitively on the cosets, so this map is surjective. Now, if $\varphi(i) = \varphi(j)$, say $\varphi(i) = Ha$ and $\varphi(j) = Hb$, then we know that $Ha = Hb \implies H = Hba^{-1}$, which means that $ba^{-1}$ is an element of $H$. But now if $\varphi$ takes $j$ to $Hb$, then $ja^{-1} \mapsto Hba^{-1} = H$, meaning that $ja^{-1} = 1$. On the other hand, $i \mapsto Ha$, so $ia^{-1} = 1$. This means $i = j$, and we've shown injectivity.


25 November 9, 2018 


Quiz scores will probably be posted by tomorrow. There are many quizzes, so we won't get them back until Wednesday. 


Definition 216 


A symmetric form on a real vector space $V$ is a function

$$V \times V \to \mathbb{R}: \quad v, w \mapsto (v, w)$$

which satisfies the following properties:
• Symmetry: $(v, w) = (w, v)$.
• Linearity in the second variable: $(v, wc) = (v, w)c$ and $(v, w_1 + w_2) = (v, w_1) + (v, w_2)$. Notice this also implies linearity in the first variable by symmetry.


(A motivating example of this is the standard dot product.) We can also define the matrix of a symmetric form 


with respect to a basis: 


Definition 217 


If we have a basis $B = (v_1, \dots, v_n)$, then the matrix of the symmetric form is the matrix $A = (a_{ij})$, where $a_{ij} = (v_i, v_j)$.

Then if $X$ and $Y$ are coordinate vectors of $v$ and $w$, then $v = BX$, $w = BY$, and we can compute the symmetric form by linearity:

$$(v, w) = \Big(\sum_i v_i x_i, \sum_j v_j y_j\Big) = \sum_{i,j} x_i (v_i, v_j) y_j = X^T A Y.$$
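
As a small numerical illustration of $(v, w) = X^TAY$, here is a sketch with a made-up symmetric matrix; the entries of `A`, `X`, `Y` and the function name `form_value` are assumptions chosen only for the example.

```python
def form_value(X, A, Y):
    """Compute X^T A Y for a real matrix A given as a list of rows."""
    n = len(A)
    return sum(X[i] * A[i][j] * Y[j] for i in range(n) for j in range(n))


A = [[2, 1],
     [1, 3]]          # hypothetical symmetric matrix a_ij = (v_i, v_j)
X = [1, 2]            # coordinate vector of v
Y = [3, -1]           # coordinate vector of w

value = form_value(X, A, Y)
swapped = form_value(Y, A, X)   # symmetry: should agree with value
print(value, swapped)
```

With the identity matrix in place of `A`, the same function returns the ordinary dot product of `X` and `Y`.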


Notice that if A is the identity matrix, this gives the standard dot product with respect to B. It's interesting to 
think about what the standard dot product looks like with respect to other bases, but we'll answer that a bit later. 


Recall that the standard dot product of a vector with itself can be rewritten as
$$X^TX = \sum_i x_i^2 = |X|^2,$$


and it is always nonnegative (and only zero when X = 0). This prompts the following definition: 


Definition 218 


A symmetric real matrix (and corresponding form) is positive definite if $(v, v) > 0$ for all $v \neq 0$.


It turns out that this doesn’t depend on the basis, and if a form is positive definite, it is the dot product for some 


basis! Now, let’s extend the idea of a symmetric form to complex numbers. 


Definition 219 


The standard Hermitian form on $\mathbb{C}^n$ is defined via
$$(X, Y) = \overline{X}^T Y = \overline{x_1}y_1 + \cdots + \overline{x_n}y_n,$$
where $X, Y$ are complex vectors (and $\overline{x_i}$ denotes the complex conjugate of $x_i$).

In particular, for any vector,
$$(X, X) = \overline{x_1}x_1 + \cdots + \overline{x_n}x_n = |X|^2,$$
since for any complex number $x = a + bi$, $\overline{x}x = a^2 + b^2 = |x|^2$. This means that the standard Hermitian form is positive definite.


Definition 220 
The adjoint of a matrix $A$ is the (conjugate transpose) matrix $A^* = \overline{A}^T$.


For example, 


This has the following rules: $A^{**} = A$, and since complex conjugation is an automorphism of $\mathbb{C}$, $\overline{AB} = \overline{A} \cdot \overline{B}$. So we also have $(AB)^* = B^*A^*$, analogous to how $(AB)^T = B^TA^T$.
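
These rules are easy to sanity-check on a concrete pair of matrices. The sketch below uses plain Python complex numbers; the matrices and the helper names `adjoint` and `matmul` are assumptions made up for the illustration.

```python
def adjoint(M):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]


def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[1 + 2j, 3j], [0j, 2 - 1j]]          # hypothetical entries
B = [[2 + 0j, 1 - 1j], [1j, 4 + 0j]]

lhs = adjoint(matmul(A, B))               # (AB)^*
rhs = matmul(adjoint(B), adjoint(A))      # B^* A^*
print(lhs == rhs, adjoint(adjoint(A)) == A)
```

Both comparisons hold exactly here because all entries are Gaussian integers, so no floating-point rounding is involved.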
Definition 221

A Hermitian form on a complex vector space $V$ is a function
$$V \times V \to \mathbb{C}: \quad v, w \mapsto (v, w)$$
which satisfies the following properties:

• Hermitian symmetry: $(w, v) = \overline{(v, w)}$ for all $v, w \in V$.

• Linearity in the second variable: $(v, w_1 + w_2) = (v, w_1) + (v, w_2)$ and $(v, wc) = (v, w)c$. In the first variable, Hermitian symmetry gives us something a little more complicated than the real case: addition still holds as normal, but this time we have $(cv, w) = \overline{(w, cv)} = \overline{(w, v)c} = \overline{c}(v, w)$.


As a check, for the standard Hermitian form,
$$(Y, X) = Y^*X = (Y^*X)^{**} = (X^*Y)^* = \overline{X^*Y} = \overline{(X, Y)},$$
where the second-to-last step comes from $X^*Y$ being a 1 by 1 matrix and therefore being equal to its own transpose.

The matrix of a Hermitian form has the same definition as before: we let $a_{ij} = (v_i, v_j)$. Again, how do we compute the form explicitly? Let's say $X$ and $Y$ are coordinate vectors for two vectors $v, w$ in some basis $B$: then $v = BX$, $w = BY$ just like before, and now

$$(v, w) = \Big(\sum_i v_i x_i, \sum_j v_j y_j\Big) = \sum_{i,j} \overline{x_i}(v_i, v_j)y_j = X^*AY.$$


As stated before, if a form is positive definite, it's a dot product in some basis. Say we have two bases $B = (v_1, \dots, v_n)$ and $B' = (v_1', \dots, v_n')$. Then we can relate these via a basechange matrix $B' = BP$: with this definition, if $v = BX = B'X'$, then $PX' = X$. In our old basis, $(v, w) = X^*AY$, and in our new basis, $(v, w) = X'^*A'Y'$, so we have

$$X^*AY = (PX')^*A(PY') = X'^*(P^*AP)Y' \implies A' = P^*AP.$$


Remember that a linear operator can be associated with a matrix (in a given basis). In a Hermitian form, we also have a matrix, but these two matrices are not the same! It may seem like a symmetric form is somehow like a linear operator: this is not true, because matrices of linear operators change to $P^{-1}AP$, while matrices of Hermitian forms change to $P^*AP$. Notice that this rule for forms preserves the Hermitian property: $(P^*AP)^* = P^*A^*P = P^*AP$ since $A^* = A$.


Lemma 222 


A form is symmetric/Hermitian (for the real/complex case) if and only if its matrix satisfies $A^* = A$. (In the real case, $A^T = A$.)

Proof. For any Hermitian form with matrix $A$,

$$Y^*AX = (X^*AY)^* = Y^*A^*X^{**} = Y^*A^*X,$$

so for this to be true for all $X, Y$, we must have $A^* = A$.
Theorem 223


Given any Hermitian form, there exists a basis such that the matrix for the form is diagonal. 


We won't do the proof explicitly here, but we know that $A$ must have real entries on the diagonal, and then we'll have conjugate pairs $a_{ij}$ and $a_{ji} = \overline{a_{ij}}$ off the diagonal. In the $2 \times 2$ case, if the top left entry is nonzero, we can do a column operation to clear out the top right entry (and get the bottom left for free). A similar argument of row and column operations helps us prove this in general.


Corollary 224 


A Hermitian form is positive definite if and only if there exists a basis so that the matrix is diagonal with positive 


entries. 


This is because we can write $X^*AX = r_1\overline{x_1}x_1 + \cdots + r_n\overline{x_n}x_n$, and this quantity is always positive for nonzero $X$ if and only if $r_i > 0$ for all $i$.

Remark 225. We can normalize this diagonal matrix by scaling the $i$th basis vector by $1/\sqrt{r_i}$ (which divides both the $i$th row and the $i$th column by $\sqrt{r_i}$). But we can only do this if the $r_i$ are all positive, so this isn't recommended in general.


26 November 14, 2018 


We'll start with a bit of review. Recall that a Hermitian form is a function $V \times V \to \mathbb{C}$, $v, w \mapsto (v, w)$, which is linear in $w$ and satisfies $(w, v) = \overline{(v, w)}$. This tells us that $(cv, w) = \overline{c}(v, w)$.

We can write a matrix with respect to a basis $B = (v_1, \dots, v_n)$. Then the matrix satisfies $A = (a_{ij})$, $a_{ij} = (v_i, v_j)$. To compute a form, we write $v = BX$, $w = BY$ for some coordinate vectors $X, Y$. Then $(v, w) = X^*AY$.

Notice that $A$ is a self-adjoint matrix, which means that $A^* = \overline{A}^T = A$. The proof is that
$$Y^*AX = \overline{X^*AY} = (X^*AY)^*$$
(because these are all 1 by 1 matrices), and therefore
$$Y^*AX = Y^*A^*X^{**} = Y^*A^*X,$$
which can only be always true if $A = A^*$.


Theorem 226 


Eigenvalues of a Hermitian matrix are real. 


In the $1 \times 1$ case, a Hermitian matrix is always just a real entry, so its eigenvalue is that real entry. In the $2 \times 2$ case, Hermitian matrices look like

$$\begin{bmatrix} r_1 & a \\ \overline{a} & r_2 \end{bmatrix}$$

with $r_1, r_2$ real. The characteristic polynomial for this matrix is $t^2 - (r_1 + r_2)t + (r_1r_2 - a\overline{a})$. Then the discriminant is

$$(r_1 + r_2)^2 - 4(r_1r_2 - a\overline{a}) = (r_1 - r_2)^2 + 4a\overline{a} = (r_1 - r_2)^2 + 4|a|^2 \geq 0.$$

Thus, applying the quadratic formula shows that all eigenvalues are real here, too.
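
The $2 \times 2$ argument translates directly into a short computation. The sketch below uses made-up entries `r1`, `r2`, `a` and a hypothetical function name; it evaluates the discriminant from the text and plugs the resulting roots back into the characteristic polynomial.

```python
import math


def hermitian_2x2_eigenvalues(r1, r2, a):
    """Eigenvalues of [[r1, a], [conj(a), r2]] with r1, r2 real."""
    disc = (r1 - r2) ** 2 + 4 * abs(a) ** 2   # never negative
    root = math.sqrt(disc)
    return ((r1 + r2 - root) / 2, (r1 + r2 + root) / 2)


r1, r2, a = 2.0, -1.0, 1 + 2j                 # hypothetical entries
low, high = hermitian_2x2_eigenvalues(r1, r2, a)


def char_poly(t):
    """t^2 - (r1 + r2) t + (r1 r2 - |a|^2)."""
    return t * t - (r1 + r2) * t + (r1 * r2 - abs(a) ** 2)


print(low, high, char_poly(low), char_poly(high))
```

Because the discriminant is a sum of squares, `math.sqrt` never sees a negative argument, which is exactly the content of the inequality above.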
Here's the proof in general:

Proof. Suppose $\lambda$ is an eigenvalue of the Hermitian matrix $A$. Then there exists a vector $X \neq 0$ such that $AX = \lambda X$. Now

$$X^*AX = X^*(AX) = X^*(\lambda X) = \lambda X^*X$$

and also

$$X^*AX = (X^*A)X = (A^*X)^*X = (AX)^*X = (\lambda X)^*X = \overline{\lambda}X^*X$$

since $A^* = A$ (as $A$ is Hermitian). But now $X^*X = |X|^2 > 0$ since we defined $X$ to not be the zero vector, so $\lambda = \overline{\lambda}$, which means that $\lambda$ is real.


Corollary 227 


A real symmetric matrix has real eigenvalues. 


(This is because real symmetric matrices are Hermitian.) 
Next, let's think a bit more about some properties of the inner product. In $\mathbb{R}^n$, $X$ and $Y$ are orthogonal if $X^TY = 0$; more generally, $X^TY = |X||Y|\cos\theta$.


Definition 228 


For a real symmetric or Hermitian form on a vector space V, we say that v and w are orthogonal if (v, w) = 0. 


This is not worth thinking about geometrically, especially since we need to fix a basis and think of many dimensions. 


(Note that v and w are orthogonal if and only if w and v are orthogonal.) 


Definition 229 
Let $W$ be a subspace of $V$. Then the orthogonal space to $W$, denoted $W^\perp$, is

$$W^\perp = \{v \in V \mid (v, w) = 0 \ \ \forall w \in W\}.$$


There is one unfortunate case: the form could be identically zero, which we don’t want to think about. 


Definition 230 
The nullspace $N$ of a form is the set of all vectors $v \in V$ such that $(v, w) = 0$ for all $w \in V$. (In other words, $N = V^\perp$.) If $N = \{0\}$, the form is called nondegenerate.

Here's a way to restate the definition: in a nondegenerate form, for any nonzero $v \in V$, there is a $w \in V$ such that $(v, w) \neq 0$.


Lemma 231 


A form is nondegenerate if and only if its matrix with respect to any basis is invertible.

Proof. Consider the representation $(v, w) = X^*AY$. If $A$ is singular (not invertible), there exists a vector $Y \neq 0$ such that $AY = 0$. Then $X^*AY$ must be 0 for all $X$, so $Y \in N$, and the form is degenerate.

On the other hand, say that $A$ is invertible. Then for any coordinate vector $Y \neq 0$, $AY \neq 0$. Therefore, we can always find $X$ such that $X^*AY \neq 0$; in fact, we can just take $X = AY$. So $Y$ is not in the nullspace, and since $Y$ was an arbitrary nonzero vector, the nullspace must be trivial.
If we have a subspace W of V, we can always look at the restricted form on W. Then the definition is exactly
the same, except that the domain is just W instead of V. Then nondegeneracy comes up again in a more subtle way: 


it's possible a form is nondegenerate in one of the two spaces but not the other. 


Example 232 


Say that $\dim V$ is 2, and we have the form defined by $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$; $A$ is degenerate, but if $W$ is the span of $v_1$, the form is nondegenerate on $W$.

On the other hand, if we take $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, this is a nondegenerate form, but if we look at it only on the span of $v_1$, it becomes degenerate.
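
Using the invertibility criterion of Lemma 231, both behaviors can be checked with a determinant. The sketch below assumes the two matrices are $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ (one choice consistent with the description); the function names are illustrative.

```python
def det2(M):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def nondegenerate_on_V(M):
    # Lemma 231: nondegenerate exactly when the matrix is invertible.
    return det2(M) != 0


def nondegenerate_on_span_v1(M):
    # The restricted form on span(v_1) has the 1x1 matrix ((v_1, v_1)).
    return M[0][0] != 0


A1 = [[1, 0], [0, 0]]   # assumed first matrix of the example
A2 = [[0, 1], [1, 0]]   # assumed second matrix of the example

for M in (A1, A2):
    print(nondegenerate_on_V(M), nondegenerate_on_span_v1(M))
```

The two matrices give opposite answers on $V$ and on the span of $v_1$, which is the subtlety the example is pointing at.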


This means that we should be careful. What does it mean for the form to be nondegenerate on $W$? It could mean that the nonzero elements of $W$ aren't in the nullspace: for every nonzero $w \in W$, we can find a $v \in V$ such that $(w, v) \neq 0$. But it can also mean that when we restrict the form to $W$, it's nondegenerate: for every nonzero $w \in W$, there is another $w' \in W$ such that $(w, w') \neq 0$. So these are different things; usually we like the second definition more.


Lemma 233 
The form on $V$ is nondegenerate on $W$ if and only if $W \cap W^\perp = \{0\}$.

In other words, if $w \neq 0$ is in $W$, then $w \notin W^\perp$, so there exists $w' \in W$ such that $(w, w') \neq 0$. The proof basically follows from the definition of degeneracy.


Theorem 234 


Given a form on $V$ and a subspace $W$ of $V$, such that the form is nondegenerate on $W$, we can write

$$V = W \oplus W^\perp.$$

In other words, $W \cap W^\perp = 0$ and $W + W^\perp = V$.


Proof. Choose a basis for $V$: $B = (w_1, \dots, w_k, u_{k+1}, \dots, u_n)$ such that the $w_i$ form a basis for $W$. Then the matrix of the form with respect to $B$ is

$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix},$$

where $A$ is $k$ by $k$ and $D$ is $n-k$ by $n-k$. We want $B$ to be 0, because the $u_i$ will then be orthogonal to the $w_j$, and so every vector in $V$ can be written uniquely as a sum of a combination of the $w_j$ and a combination of the $u_j$.

So we'll change our basis, which changes the matrix to $M' = P^*MP$. Letting $P = \begin{bmatrix} I & Q \\ 0 & I \end{bmatrix}$ for some undetermined $Q$,

$$P^*MP = \begin{bmatrix} I & 0 \\ Q^* & I \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I & Q \\ 0 & I \end{bmatrix},$$

which results in $AQ + B$ where $B$ originally was. But $A$ is invertible (the form is nondegenerate on $W$), so we can just take $Q = -A^{-1}B$ and we've block-diagonalized the matrix! Notice that a change of basis does not change the fact that the matrix is Hermitian, so we don't need to verify that the bottom left block also disappears.
Let $V$ be a vector space with an inner product $(,)$, and let $W$ be a subspace of $V$. Recall that $v_1, v_2$ are orthogonal if $(v_1, v_2) = 0$: let $W^\perp$, defined to be $\{u \in V \mid u \perp W\}$, be the orthogonal space of $W$.

We say that a form is nondegenerate on $W$ if for all nonzero $w \in W$, there exists a $w' \in W$ so that $(w, w') \neq 0$. (In other words, $w$ doesn't collapse everything.) We found last time that a form is nondegenerate on $W$ if and only if the matrix of the form restricted to $W$ is invertible. (In that case, every element in $V$ can be written as $w + u$, where $w \in W$, $u \in W^\perp$, in exactly one way. And this is equivalent to saying that $W + W^\perp = V$, while $W \cap W^\perp = \{0\}$.)

Now, we'll talk about the orthogonal projection operator. Consider a map $\pi: V \to W \subset V$. If $v = w + u$, with $w \in W$, $u \in W^\perp$, then $v$ is sent to $w$. This is a linear transformation satisfying the two defining properties:
• If $w \in W$, then $\pi(w) = w$.
• If $u \in W^\perp$, then $\pi(u) = 0$.

There is a nice formula, but first let's discuss the concept of an orthogonal basis. 


Definition 235 


Let $(v_1, \dots, v_n)$ be a basis of $V$. If $(v_i, v_j) = 0$ for all $i \neq j$, which means that the matrix of the form with respect to this basis is diagonal, then the basis is orthogonal.


We now want to show that our vectors have “positive norm,” so that we can form a basis with them: 


Lemma 236 


If a form $(,)$ is not identically zero, then there exists $v$ such that $(v, v) \neq 0$.

Proof. We know there exist $v_1, v_2$ so that $(v_1, v_2) = c \neq 0$. Now, replace $v_2$ by $v_2c^{-1}$ (since the form is linear with respect to the second variable), so we have $(v_1, v_2) = 1$. Now $(v_2, v_1) = \overline{1} = 1$ and

$$(v_1 + v_2, v_1 + v_2) = (v_1, v_1) + (v_2, v_2) + (v_2, v_1) + (v_1, v_2) = (v_1, v_1) + (v_2, v_2) + 2,$$

so at least one of the $(v, v)$ terms in this equation is not zero.


Proposition 237 


There exists an orthogonal basis for any nondegenerate Hermitian form. 


Proof. If the form is identically zero, any basis is an orthogonal basis.
Otherwise, we pick $v_1$ such that $(v_1, v_1) \neq 0$. Let $W$ be the span of $v_1$. Then the matrix of the form restricted to $W$ is $((v_1, v_1))$, which is invertible, so the form is nondegenerate on $W$. Thus $V = W \oplus W^\perp$, and we can induct on the dimension to show that $W^\perp$ also has an orthogonal basis. Tack on $v_1$ to this basis and we're done.


Theorem 238 (Projection formula) 


Let a form be nondegenerate on $W$ with an orthogonal basis $(w_1, \dots, w_k)$ (on $W$). Then if $v \in V$,

$$\pi(v) = w_1c_1 + \cdots + w_kc_k, \quad \text{where } c_i = \frac{(v, w_i)}{(w_i, w_i)}.$$

Notice that $(w_i, w_i)$ is one of the diagonal entries of a diagonal matrix that is invertible, so it must be nonzero.


Proof. It is sufficient to show that $\pi(w_j) = w_j$ and $\pi(u) = 0$ for all $u \in W^\perp$. If $u$ is orthogonal to $W$, then $(u, w_i) = 0$ for all $i$, so $\pi(u) = 0$.

On the other hand, plugging in a basis vector $w_j$ into the above definition, we have
$$\pi(w_j) = \sum_i w_ic_i = \sum_i w_i \frac{(w_j, w_i)}{(w_i, w_i)}.$$

But this is an orthogonal basis, so $c_i$ is 0 except when $i = j$, in which case it is 1. So $\pi(w_j) = 0 + \cdots + 0 + w_j \cdot 1 = w_j$.


Example 239 


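Let $V = \mathbb{R}^3$ with the standard form (dot product), and let $W$ be the span of $(w_1, w_2)$, where $w_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $w_2 = \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}$. What is $\pi(v)$, where $v = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$?

$$\pi(v) = \frac{6}{3}w_1 + \frac{-3}{6}w_2 = \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix} - \begin{bmatrix} 1/2 \\ 1/2 \\ -1 \end{bmatrix} = \begin{bmatrix} 3/2 \\ 3/2 \\ 3 \end{bmatrix}.$$

One special case of this formula is when $W = V$. Then if $(v_1, \dots, v_n)$ is an orthogonal basis for $V$, then $v = v_1c_1 + \cdots + v_nc_n$, where $c_i = \frac{(v, v_i)}{(v_i, v_i)}$. (Basically, this gives us a way to decompose our vector into components along the basis vectors.)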


Example 240 
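Take $w_3 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}$, and keep $v = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$.

Then $(w_3, w_3) = 2$, and $(v, w_3) = -1$. Therefore, the projection formula tells us that
$$v = 2w_1 - \frac{1}{2}w_2 - \frac{1}{2}w_3.$$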

(We can verify this by direct computation as well!) 
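
Here is one way to carry out that direct computation. The sketch assumes $w_1 = (1, 1, 1)$, $w_2 = (1, 1, -2)$, $w_3 = (1, -1, 0)$ and $v = (1, 2, 3)$, a choice consistent with the inner products quoted in the two examples; the function names are illustrative. Exact fractions avoid any rounding.

```python
from fractions import Fraction


def dot(u, v):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def coefficients(v, basis):
    """c_i = (v, w_i) / (w_i, w_i) for an orthogonal family w_i."""
    return [Fraction(dot(v, w), dot(w, w)) for w in basis]


def combine(coeffs, basis):
    """Return the vector sum of c_i * w_i."""
    return [sum(c * w[k] for c, w in zip(coeffs, basis))
            for k in range(len(basis[0]))]


w1, w2, w3 = (1, 1, 1), (1, 1, -2), (1, -1, 0)   # assumed orthogonal basis
v = (1, 2, 3)

proj_W = combine(coefficients(v, [w1, w2]), [w1, w2])   # projection to span(w1, w2)
full = coefficients(v, [w1, w2, w3])                    # expansion of v itself
print(proj_W, full)
```

The projection comes out as $(3/2, 3/2, 3)$ and the full expansion has coefficients $2, -1/2, -1/2$; recombining the latter returns $v$.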

It's tempting to get rid of the denominators in the projection formula. If $(w_i, w_i)$ is positive, we can divide $w_i$ by the square root of $(w_i, w_i)$, and this just gives us $c_i = (v, w_i)$. On the other hand, if $(w_i, w_i)$ is negative, we'll get a denominator of $-1$. However, the vector will then have a square root term. Division is better than having square roots, so (again) it is advised to not do this.


Example 241 


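Let $V = \mathbb{R}^{2 \times 2}$, and let $(A, B)$ be the trace of $AB$, $a_{11}b_{11} + a_{12}b_{21} + a_{21}b_{12} + a_{22}b_{22}$.

Let's choose an orthogonal basis for this vector space. This is not too hard: start with

$$v_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad v_4 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$

Then $(v_1, v_1) = (v_2, v_2) = 1$, $(v_3, v_3) = 2$ and $(v_4, v_4) = -2$. So now we can decompose $v = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$: we find that $(v, v_1) = 1$, $(v, v_2) = 4$, $(v, v_3) = 5$, and $(v, v_4) = -1$. So

$$v = v_1 + 4v_2 + \frac{5}{2}v_3 + \frac{1}{2}v_4,$$

which is true because

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 4 \end{bmatrix} + \begin{bmatrix} 0 & 5/2 \\ 5/2 & 0 \end{bmatrix} + \begin{bmatrix} 0 & -1/2 \\ 1/2 & 0 \end{bmatrix}.$$

Note that $v_1, v_2, v_3$ form a basis for the symmetric matrices, so if we wanted to project $v$ to the space of symmetric matrices, $v$ would just become $v - \frac{1}{2}v_4$. On the other hand, if we project to the space of skew-symmetric matrices (spanned by $v_4$), we just get $\frac{1}{2}v_4$. So this is a useful way to think about subspaces of our original vector space!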
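
The numbers in this example can be recomputed with a few lines of code. The sketch below assumes the basis $v_1 = E_{11}$, $v_2 = E_{22}$, $v_3 = E_{12} + E_{21}$, $v_4 = E_{21} - E_{12}$ (the sign of $v_4$ is chosen so that $(v, v_4) = -1$, as quoted), and the function name `trace_form` is illustrative.

```python
from fractions import Fraction


def trace_form(A, B):
    """(A, B) = trace(AB) for 2x2 matrices given as lists of rows."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))


v1 = [[1, 0], [0, 0]]
v2 = [[0, 0], [0, 1]]
v3 = [[0, 1], [1, 0]]
v4 = [[0, -1], [1, 0]]          # assumed sign convention
basis = [v1, v2, v3, v4]
v = [[1, 2], [3, 4]]

norms = [trace_form(b, b) for b in basis]
coeffs = [Fraction(trace_form(v, b), trace_form(b, b)) for b in basis]
print(norms, coeffs)
```

Note that $(v_4, v_4)$ is negative, so this form is not positive definite even though the projection formula still applies.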


28 November 19, 2018 


Today, we will discuss linear operators on a Hermitian space. First, a bit of review: let $V$ be a complex vector space with a Hermitian form $(,)$. Let's say there exists an orthogonal basis $(v_1, \dots, v_n)$; that is, $(v_i, v_j) = 0$ for all $i \neq j$.

Definition 242

If $(v_i, v_i) > 0$ for all $i$, then we can scale our vectors so that $(v_i, v_i) = 1$. Then this is called an orthonormal basis, and the matrix of the form is the identity.

In this basis, the form becomes the standard dot product $(v, w) = X^*Y$, where $v = BX$, $w = BY$ with respect to $B$. An orthonormal basis exists if and only if the form is positive definite: $(v, v) > 0$ for all $v \neq 0$.


Definition 243 


A vector space V with a positive definite form (,) is a Hermitian space. 


We can also change from orthonormal bases to other orthonormal bases: the rule is that if $B, B'$ are two bases, then we can write $B' = BP$ for some specific matrix $P$. Let $M$ be the matrix of our positive definite form with respect to $B$, and let $M'$ be the matrix with respect to $B'$. Then since $M = M' = I$ (we started and ended up with an orthonormal basis),

$$M' = P^*MP \implies I = P^*IP = P^*P.$$


Definition 244 


A matrix $P$ is unitary if $P^*P = I$.


(This is very similar to being orthogonal in the real-valued case.) 
Example 245

What are the unitary matrices for $2 \times 2$ matrices?

Let $P = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} X & Y \end{bmatrix}$ for column vectors $X, Y$. Then

$$P^*P = \begin{bmatrix} \overline{a}a + \overline{c}c & \overline{a}b + \overline{c}d \\ \overline{b}a + \overline{d}c & \overline{b}b + \overline{d}d \end{bmatrix} = \begin{bmatrix} |X|^2 & X^*Y \\ Y^*X & |Y|^2 \end{bmatrix}.$$

So if this is the identity, we just want $|X| = |Y| = 1$ and $X^*Y = 0$: our column vectors form an orthonormal basis.

We can also think about unitary operators in the framework of linear operators! Let $T: V \to V$ be a linear operator, and let $A$ be the matrix of $T$ with respect to an orthonormal basis. Then the change of basis from $B$ to $B'$ for a linear operator is $A' = P^{-1}AP$. But if $B'$ and $B$ are orthonormal bases, then $P$ must be unitary, so $P^*P = I \implies P^{-1} = P^*$. Therefore, $A' = P^*AP$ is true as well, so changing the basis for the linear operator and the Hermitian form gives us the same expression.
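
The $2 \times 2$ criterion (orthonormal columns) can be tested numerically. This is a minimal sketch with made-up matrices; `is_unitary` and the tolerance are assumptions for the illustration.

```python
import math


def adjoint(M):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def is_unitary(P, tol=1e-12):
    """Check P^* P = I entry by entry, up to rounding."""
    prod = matmul(adjoint(P), P)
    n = len(P)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


s = 1 / math.sqrt(2)
P = [[s, s], [1j * s, -1j * s]]   # columns (1, i)/sqrt(2) and (1, -i)/sqrt(2)
Q = [[1, 1], [0, 1]]              # columns are not orthogonal

print(is_unitary(P), is_unitary(Q))
```

The columns of `P` have length 1 and are orthogonal under the standard Hermitian form, so `P` passes; `Q` fails because its columns are not orthogonal.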


Definition 246 


Given an operator $T: V \to V$ with matrix $A$ (with respect to an orthonormal basis), the adjoint operator is the operator $T^*$ corresponding to the matrix $A^*$.


Here’s a characteristic property for adjoint operators: 


Proposition 247 


For a linear operator T and its adjoint operator T*, we have (Tv, w) = (v, T*w). 


We can also do the same thing the other way around: we have (v, Tw) = (T*v, w). 


Proof. Let $v = BX$, $w = BY$ with respect to an orthonormal basis $B$. Then the coordinate vector of $Tv$ is $AX$, so
$$(Tv, w) = (AX)^*Y = X^*A^*Y.$$
Similarly, the coordinate vector of $T^*w$ is $A^*Y$, so

$$(v, T^*w) = X^*(A^*Y),$$

and indeed the two are equal.

We can check that this definition still holds when we change to a different orthonormal basis: if we let $B$ be the new matrix for $T^*$, then
$$B = P^*A^*P,$$
but $B^* = P^*AP = A'$, so we still have $B = A'^*$ and everything works.
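
The characteristic property $(Tv, w) = (v, T^*w)$ can be checked on coordinate vectors with respect to an orthonormal basis, where the form is $X^*Y$. The matrix and vectors below are made up for the illustration, and the helper names are hypothetical.

```python
def herm(X, Y):
    """Standard Hermitian form X^* Y."""
    return sum(x.conjugate() * y for x, y in zip(X, Y))


def apply(M, X):
    """Matrix times column vector."""
    return [sum(M[i][j] * X[j] for j in range(len(X))) for i in range(len(M))]


def adjoint(M):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]


A = [[1 + 1j, 2], [3j, 4 - 2j]]   # hypothetical matrix of T
X = [1 - 1j, 2j]                  # coordinates of v
Y = [3, 1 + 1j]                   # coordinates of w

lhs = herm(apply(A, X), Y)            # (Tv, w)
rhs = herm(X, apply(adjoint(A), Y))   # (v, T^* w)
print(lhs, rhs)
```

The two values agree, as the proof predicts; swapping the roles gives $(v, Tw) = (T^*v, w)$ in the same way.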


Lemma 248 


If $v_1, v_2 \in V$, and $(v_1, w) = (v_2, w)$ for all $w \in V$, then $v_1 = v_2$.

This is true because by linearity, we have $(v_1 - v_2, w) = 0$ for all $w$. Plugging in $w = v_1 - v_2$ gives $(v_1 - v_2, v_1 - v_2) = 0$, and since the form is positive definite, this can only happen if $v_1 - v_2 = 0$.


Here's a natural extension of what we've been discussing: 
Definition 249

A linear operator $T$ is a Hermitian operator if $T^* = T$; that is, for the associated matrices, $A^* = A$.

Looking back at the characteristic property, we have $(Tv, w) = (v, Tw)$. Indeed, it's true that $(AX)^*Y = X^*(AY)$, because $A^* = A$ for a Hermitian operator.


Definition 250 
A linear operator $T$ is unitary if $T^* = T^{-1}$; that is, $A^*A = I$.


This one has a characteristic property as well: 


Proposition 251 


For a unitary operator T, (Tv, Tw) = (v,w). 


Proof. We want (writing the inner products in matrix form) the equality $(AX)^*AY = X^*Y$, which is satisfied for all $X$ and $Y$ as long as $A^*A = I$. Alternatively, we could have used the adjoint operator:

$$(Tv, Tw) = (v, T^*Tw) = (v, w).$$


Finally, we talk about a more general category of operators: 


Definition 252 


A linear operator $T$ is normal if $T^*T = TT^*$.


This includes both the unitary and Hermitian operators, since unitary operators have both products equal to the identity and Hermitian operators have $T = T^*$.


Proposition 253 
(Tv, Tw) = (T*v, T*w) for normal operators T. 


Proof. We use the adjoint property: (Tv, Tw) = (v, T*Tw) = (v, TT*w) = (T*v, T*w). 


The normal operators are completely uninteresting in general, but we have an important result (which we will prove 


next time): 


Theorem 254 (Spectral Theorem) 
Let $T: V \to V$ be a normal operator on a Hermitian space $V$. Then there exists an orthonormal basis of eigenvectors of $T$ in $V$.


Corollary 255 


For any Hermitian matrix A, there exists a unitary matrix P such that P*AP is real diagonal. If A is unitary, there 


exists a unitary P such that P*AP is diagonal and unitary. 
In particular, let's say $A$ was already diagonal and unitary. Then if those entries are $a_1, \dots, a_n$, $A^*A$ is the diagonal matrix with diagonal entries $\overline{a_i}a_i$. So this means all entries must have absolute value 1.


Example 256 


Take $A$ to be the rotation matrix $\begin{bmatrix} c & -s \\ s & c \end{bmatrix}$, where $c = \cos\theta$ and $s = \sin\theta$. Then $A^*A = I$, so this matrix is unitary and orthogonal.

Then the Spectral Theorem says there exists a unitary matrix $P$ such that $P^*AP$ is diagonal: it turns out (after some computation) to be

$$P^*AP = \begin{bmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{bmatrix}.$$
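
The "some computation" can be delegated to a short script. The sketch below assumes one particular unitary $P$, with columns $(1, -i)/\sqrt{2}$ and $(1, i)/\sqrt{2}$; these are eigenvectors of the rotation matrix for $e^{i\theta}$ and $e^{-i\theta}$ respectively. The angle is arbitrary and the helper names are illustrative.

```python
import cmath
import math


def adjoint(M):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


theta = 0.7                                   # arbitrary angle
c, s = math.cos(theta), math.sin(theta)
A = [[c, -s], [s, c]]                         # rotation by theta

r = 1 / math.sqrt(2)
P = [[r, r], [-1j * r, 1j * r]]               # assumed eigenvector basis

D = matmul(matmul(adjoint(P), A), P)          # P^* A P
print(D)
```

Up to rounding, `D` is diagonal with entries $e^{i\theta}$ and $e^{-i\theta}$, both of absolute value 1, as required for a diagonal unitary matrix.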


29 November 21, 2018 


We'll start with some review. Let $V$ be a Hermitian space with a positive definite form $(,)$. (If $B = (v_1, \dots, v_n)$ is an orthonormal basis, then the form is just $(v, w) = X^*Y$. And if we want to change between two orthonormal bases via $B' = BQ$, then $Q$ is unitary.) Then if $T$ is a linear operator $V \to V$ with matrix $A$ with respect to $B$, then the adjoint operator $T^*$ has matrix $A^* = \overline{A}^T$.

Generally, we always have $(Tv, w) = (v, T^*w)$, or equivalently, $(T^*v, w) = (v, Tw)$. $T$ is called Hermitian if $T^* = T$; then we also have $(Tv, w) = (v, Tw)$. $T$ is unitary if $T^*T = I$; in this case, we have $(Tv, Tw) = (v, w)$. Finally, $T$ is normal (which encompasses the first two categories) if $T^*$ and $T$ commute. In these cases, we always have $(Tv, Tw) = (T^*v, T^*w)$.

The main result of today is the spectral theorem, which says that if $T$ is a normal operator on a Hermitian space $V$, then there exists an orthonormal basis of eigenvectors of $T$ in $V$. We can also state this result in matrix form:


Theorem 257 (Spectral Theorem, matrix form) 


Let $A$ be a matrix such that $AA^* = A^*A$. Then there exists a unitary $Q$ such that $Q^*AQ$ is diagonal (the columns of $Q$ are the eigenvector basis).

Two special cases: if $A$ is Hermitian ($A^* = A$), there exists $Q$ unitary such that $Q^*AQ = D$ is a real diagonal matrix. Meanwhile, if $A$ is unitary, we can find $Q$ such that $Q^*AQ$ is unitary and diagonal; in particular, the diagonal entries must all have magnitude 1.

To prove this result, we'll start by thinking about invariant subspaces. If $T: V \to V$ is our linear operator, and $W$ is a subspace of $V$, then $W$ is $T$-invariant if $TW \subset W$.


Lemma 258 


If W is T-invariant, then W+ is T*-invariant. 


Proof of lemma. Take a vector $u \in W^\perp$; we want to show $T^*u \in W^\perp$. In other words, we need to show that
$$(T^*u, w) = 0 \quad \forall w \in W.$$

But by properties of the form, $(T^*u, w) = (u, Tw)$, and since $Tw \in W$ (by assumption that $W$ is $T$-invariant) and $u \in W^\perp$, this must be equal to 0. Thus $T^*u$ is still in $W^\perp$, so $W^\perp$ is $T^*$-invariant.


Lemma 259 


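If $T$ is a normal operator with eigenvector $v$ such that $Tv = \lambda v$, then $v$ is also an eigenvector of $T^*$ with eigenvalue $\overline{\lambda}$.

Proof of lemma. First, we do a special case. If $\lambda = 0$, then $Tv = 0$. Since $T$ is normal,
$$(T^*v, T^*v) = (Tv, Tv) = (0, 0) = 0.$$

But the form is positive definite, so this can only be 0 if $T^*v$ is the zero vector, and indeed $v$ is an eigenvector of $T^*$ with eigenvalue 0.

Now let's say $\lambda$ is arbitrary. Let $S = T - \lambda I$. Then
$$Sv = Tv - \lambda v = 0,$$
so $v$ is an eigenvector of $S$ with eigenvalue 0. Notice that $S$ is normal, because $S^* = T^* - \overline{\lambda}I$, so

$$S^*S = (T^* - \overline{\lambda}I)(T - \lambda I) = T^*T - \overline{\lambda}T - \lambda T^* + \lambda\overline{\lambda}I$$

while
$$SS^* = TT^* - \overline{\lambda}T - \lambda T^* + \lambda\overline{\lambda}I,$$

which are clearly equal since $TT^* = T^*T$. Thus, by the special case above, $v$ is also an eigenvector of $S^*$ with eigenvalue 0. This means $S^*v = 0 \implies T^*v - \overline{\lambda}v = 0 \implies T^*v = \overline{\lambda}v$, completing the proof.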


With this, we can finally prove our main result: 


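Proof of the spectral theorem. We are given a normal operator $T: V \to V$. Pick an eigenvector $v_1$ such that $Tv_1 = \lambda_1v_1$, and let $W$ be the span of $v_1$. The form is nondegenerate on $W$, so $V = W \oplus W^\perp$. By Lemma 259, $v_1$ is also an eigenvector of $T^*$, so $W$ is $T^*$-invariant ($T^*$ just scales $W$); applying Lemma 258 to $T^*$ (whose adjoint is $T$) gives $TW^\perp \subset W^\perp$. In other words, we can restrict the operator $T$ to $W^\perp$, and it will still be normal.

Now just induct: take $v_1$ (scaled to length 1) along with an orthonormal basis of eigenvectors of $W^\perp$.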


Proposition 260 (Polar Decomposition) 


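Every invertible complex-valued matrix can be written uniquely as a product of the form $A = UP$, where $U$ is unitary and $P$ is positive definite Hermitian.

In other words, $GL_n(\mathbb{C})$ is in bijection with $U_n \times \mathcal{P}$, where $\mathcal{P}$ is the set of positive definite Hermitian matrices.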


Lemma 261 


If A is invertible, then A*A is positive definite and Hermitian. 


Proof. This matrix is Hermitian because $(A^*A)^* = A^*(A^{**}) = A^*A$. Meanwhile, for all $x \neq 0$, we have $Ax \neq 0$ (as $A$ is invertible), so

$$0 < (Ax, Ax) = x^*A^*Ax,$$

so $A^*A$ is indeed positive definite.


Proof of Proposition 260. Using the previous lemma, we know by the Spectral Theorem that there exists a unitary $Q$ such that $Q^*(A^*A)Q = D$ is diagonal with positive real entries. If $D$ has diagonal entries $d_1, \dots, d_n$, let $R$ be the diagonal matrix with entries $r_i = \sqrt{d_i}$ on the diagonal. Then

$$Q^*A^*AQ = R^2 \implies A^*A = QR^2Q^* = (QRQ^*)^2.$$

Since $Q$ is unitary, $QRQ^*$ is just a change of basis. But this means that $P = QRQ^*$ is also positive definite and Hermitian, so
$$A^*A = P^2 = P^*P.$$

Therefore, $((P^*)^{-1}A^*)(AP^{-1}) = I$, so $(AP^{-1})^*(AP^{-1}) = I$. So $U = AP^{-1}$ is unitary, and $A = UP$: we've indeed written $A$ in the desired form.


Finally, showing uniqueness comes from the fact that any matrix that is both positive definite Hermitian and unitary 


is the identity matrix (because all eigenvalues must be 1). 
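As a sanity check on the construction $U = AP^{-1}$, here is a minimal Python sketch for the 2×2 case. It is an illustration under assumptions rather than the argument from lecture: instead of diagonalizing $A^*A$, it uses the closed form $P = (M + \sqrt{\det M}\, I)/\operatorname{tr} P$ for the square root of a 2×2 positive definite matrix $M = A^*A$ (a consequence of Cayley-Hamilton). The helper names `mul`, `dagger`, `det`, `inverse`, `polar` and the sample matrix are made up for this example.

```python
import math

def mul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    """Conjugate transpose X*."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inverse(X):
    d = det(X)
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def polar(A):
    """Return (U, P) with A = U P for an invertible 2x2 complex matrix A."""
    M = mul(dagger(A), A)                    # A*A, positive definite (Lemma 261)
    root_det = math.sqrt(det(M).real)        # det P = sqrt(det M)
    trace_P = math.sqrt((M[0][0] + M[1][1]).real + 2 * root_det)
    # Cayley-Hamilton for P: P^2 - (tr P) P + (det P) I = 0, with P^2 = M.
    P = [[(M[i][j] + (root_det if i == j else 0)) / trace_P
          for j in range(2)] for i in range(2)]
    U = mul(A, inverse(P))                   # U = A P^{-1}
    return U, P

A = [[1 + 2j, 3], [0.5j, 1 - 1j]]            # arbitrary invertible example
U, P = polar(A)
```

With these values, `mul(dagger(U), U)` should be the identity up to rounding, and `mul(U, P)` should reproduce `A`.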
Definition 262

A quadric $Q$ is the locus of points in $\mathbb{R}^n$ that satisfy $f(x) = 0$ for some quadratic polynomial $f$.


Let's start studying these objects with some linear algebra: 


Definition 263 


A real vector space V with a positive definite symmetric form (,) is called a Euclidean space. 


We use orthonormal bases (just like we did in Hermitian spaces), and a change of orthonormal bases is given by $PX' = X$ for some orthogonal matrix $P$. As before, a linear operator $T : V \to V$ is symmetric if its matrix $A$ with respect to an orthonormal basis is symmetric. Alternatively, we could just say that $\langle Tv, w \rangle = \langle v, Tw \rangle$ (expand out the forms $(AX)^TY$ and $X^TAY$, noticing that $A = A^T$).

Then the Spectral Theorem holds here, too: 


Theorem 264 
There exists an orthonormal basis of eigenvectors for a symmetric operator. In other words, if $A$ is a real symmetric matrix, then there exists some base change $P$, which is orthogonal, such that $P^TAP = D$ is diagonal.


Let's try to show this. 


Remark 265. The following are just special cases of what we discussed above! For example, the lemma below 


is just a special case of the lemma from last class. 


Lemma 266 


Let $T$ be a symmetric operator. If $W$ is $T$-invariant, which means $TW \subset W$, then $W^\perp$ is also $T$-invariant.

Proof. If $u \in W^\perp$, then $\langle u, w \rangle = 0$ for all $w \in W$. Then
$$\langle Tu, w \rangle = \langle u, Tw \rangle = 0$$
since $Tw \in W$ and $u$ is orthogonal to anything in $W$. Thus, $Tu$ is orthogonal to all $w \in W$, so $Tu \in W^\perp$ as well.


Lemma 267 


Eigenvalues of a real symmetric matrix are real. 


Again, this is just a repeat of a previous proof. 


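Proof of Theorem 264. Choose an eigenvector $v_1$ such that $Tv_1 = \lambda v_1$ (possible because the eigenvalues are real by Lemma 267), and normalize $v_1$ to have length 1. Let $W$ be the span of $v_1$; the form is nondegenerate here because it is positive definite, and it is also nondegenerate on $W^\perp$.

Thus, $V = W \oplus W^\perp$. By Lemma 266, $W^\perp$ is $T$-invariant, so by induction on the dimension we have some orthonormal basis $(v_2, \dots, v_n)$ of eigenvectors for $W^\perp$. Then $V$ has an orthonormal basis $(v_1, \dots, v_n)$ of eigenvectors, as desired.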


Let's use this to classify quadrics. We'll start with conics: this is the locus of points that satisfy
$$a_{11}x_1^2 + a_{12}x_1x_2 + a_{22}x_2^2 + b_1x_1 + b_2x_2 + c = 0.$$
We'll classify such quadrics up to an isometry $f$ of $\mathbb{R}^2$, which we can write in the form $t_v\varphi$, where $t_v(x) = x + v$ is a translation and $\varphi$ is some orthogonal operator (reflection or rotation).

In matrix form, if $X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$, our conic looks like
$$q(x_1, x_2) = X^TAX + BX + c.$$
But we want a symmetric matrix, so let's split the $a_{12}x_1x_2$ term into a part for $x_1x_2$ and a part for $x_2x_1$. In general for a quadric, this would look like
$$\sum_{i,j} x_ia_{ij}x_j + \sum_i b_ix_i + c$$
where $a_{ij} = a_{ji}$. Thus we've made $A$ into a symmetric matrix, so we can diagonalize and remove all cross terms $x_ix_j$.

In other words, applying the change of coordinates $X = PX'$ for some orthogonal matrix $P$, we find that $A' = P^TAP$ is diagonal, which results in (dropping the primes because they're not necessary)
$$q(x_1, x_2) = a_{11}x_1^2 + a_{22}x_2^2 + b_1x_1 + b_2x_2 + c.$$
(For a general quadric in $n$ variables, this is $\sum a_{ii}x_i^2 + \sum b_ix_i + c$.) But now we're allowed to apply a translation. We can complete the square for each variable and translate by $v_i = -\frac{b_i}{2a_{ii}}$ (assuming $a_{ii} \neq 0$), and this will toss all of those linear terms out!

So now our new form is (almost always)
$$q(x_1, x_2) = a_{11}x_1^2 + a_{22}x_2^2 + c,$$
and in general, this means a quadric is (almost always) equivalent to $\sum a_{ii}x_i^2 = c$.


Remark 268. Well, there is an edge case. If $a_{11} = a_{22} = 0$, that would just be a linear equation, so ignore that case. But it's possible that (for example) we have one of the quadratic terms disappearing, so $a_{22} = 0$. To avoid degeneracy in this case, we must have $a_{11}, b_2 \neq 0$. Then we can translate in $x_2$ to remove the constant and translate in $x_1$ to remove the $x_1$ term, and we have an equation of the form $a_{11}x_1^2 + b_2x_2 = 0$, which is a parabola.
Assume we're not in that degenerate edge case, and WLOG let $a_{11} > 0$. Then we can write our equation as
$$a_{11}x_1^2 \pm a_{22}x_2^2 = c$$
with $a_{22} > 0$. If we have a $+$, then we get an ellipse for $c > 0$, just the point 0 for $c = 0$, and nothing for $c < 0$. (The latter two are “degenerate conics.”) On the other hand, if we have a $-$, then that's a hyperbola for $c \neq 0$ and two intersecting lines for $c = 0$. Again, the latter case is degenerate.
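The classification above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `classify_conic`, the tolerance `tol`, and the returned labels are invented for the example, and the eigenvalue-zero edge case is reported but not worked out. The code finds the eigenvalues of the symmetric matrix of the quadratic part, translates to the center to remove the linear terms, and then reads off the type from the signs.

```python
import math

def classify_conic(a11, a12, a22, b1, b2, c, tol=1e-12):
    """Classify a11*x^2 + a12*x*y + a22*y^2 + b1*x + b2*y + c = 0."""
    h = a12 / 2.0                        # symmetric matrix [[a11, h], [h, a22]]
    det = a11 * a22 - h * h              # product of the two eigenvalues
    if abs(det) < tol:
        return "edge case (an eigenvalue is 0, see Remark 268)"
    mean = (a11 + a22) / 2.0
    radius = math.hypot((a11 - a22) / 2.0, h)
    lam1, lam2 = mean + radius, mean - radius
    # Center v solves 2*A*v + b = 0; translating there removes linear terms.
    vx = -(a22 * b1 - h * b2) / (2.0 * det)
    vy = -(a11 * b2 - h * b1) / (2.0 * det)
    q_center = (a11 * vx * vx + a12 * vx * vy + a22 * vy * vy
                + b1 * vx + b2 * vy + c)
    rhs = -q_center                      # lam1*x'^2 + lam2*y'^2 = rhs
    if lam1 * lam2 > 0:                  # same signs
        if abs(rhs) < tol:
            return "point"
        return "ellipse" if rhs * lam1 > 0 else "empty"
    return "two intersecting lines" if abs(rhs) < tol else "hyperbola"

unit_circle = classify_conic(1, 0, 1, 0, 0, -1)
rotated_hyperbola = classify_conic(0, 1, 0, 0, 0, -1)   # x*y = 1
crossing_lines = classify_conic(1, 0, -1, 0, 0, 0)      # x^2 = y^2
```

Note that $xy = 1$ has no squared terms at all before diagonalizing; the cross term alone produces eigenvalues of opposite sign.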

Well, we could do this in any number of variables if we wanted — we just look at the number of signs that are 


positive versus negative and classify into different surfaces. 


Example 269 


What about quadrics in three dimensions? 


We have $q(x_1, x_2, x_3) = X^TAX + BX + c$; make $A$ diagonal again using the Spectral Theorem. Now, if $a_{ii} \neq 0$ for all $i$, we translate to get rid of the $b_i$, which gives
$$a_{11}x_1^2 \pm a_{22}x_2^2 \pm a_{33}x_3^2 = c$$
(we always have 0, 1, 2, or 3 plus signs, and we can multiply by $-1$ if we have 0 or 1). Three pluses give an ellipsoid for $c > 0$, two pluses give a hyperboloid if $c \neq 0$.

Now what about $a_{11}x_1^2 + a_{22}x_2^2 - a_{33}x_3^2 = 0$? This equation is homogeneous, so scaling a solution still satisfies the equation. This turns out to be a cone through the origin! While this is a degenerate quadric, it is still pretty interesting.


Remark 270. Suppose we want to sketch a picture of the locus of points $g(x_1, x_2) = c$ for some $c \neq 0$, given the graph of $g(x_1, x_2) = 0$. The idea is to draw regions in the plane where the function is positive and negative, and we just trace near the curve $g = 0$ either on the positive or negative side!

So the locus of $a_{11}x_1^2 + a_{22}x_2^2 - a_{33}x_3^2 = c$ for $c > 0$ is a vase-shaped object (a one-sheeted hyperboloid), and the locus for $c < 0$ is two cup-shaped objects (a two-sheeted hyperboloid). And to complete our classification, we have our paraboloids, of which there are two kinds: the standard bowl (which looks like $x_3 = x_1^2 + x_2^2$) and the saddle-shaped hyperbolic paraboloid (which looks like $x_3 = x_1^2 - x_2^2$).
Today we're going to talk about a special group:


Definition 271 


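The special unitary group $SU_2$ is the set of matrices $A \in GL_2(\mathbb{C})$ that satisfy $A^*A = I$ with determinant 1.

In particular, if $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then $A^* = \begin{pmatrix} \bar a & \bar c \\ \bar b & \bar d \end{pmatrix}$ must be equal to $A^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. This means $d = \bar a$ and $c = -\bar b$, and furthermore, $a\bar a + b\bar b = 1$.

If we write $a = x_0 + x_1i$, $b = x_2 + x_3i$, then a necessary and sufficient condition is that
$$x_0^2 + x_1^2 + x_2^2 + x_3^2 = 1.$$
This is the unit 3-sphere in $\mathbb{R}^4$, since the surface has dimension 3.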
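To make the correspondence between points of the 3-sphere and matrices concrete, here is a small Python sketch. The function name `su2` and the sample point are assumptions chosen for illustration; the code builds the matrix with $a = x_0 + x_1i$, $b = x_2 + x_3i$ and checks the two defining conditions numerically.

```python
def su2(x0, x1, x2, x3):
    """Matrix [[a, b], [-conj(b), conj(a)]] for a point of the unit 3-sphere."""
    a = complex(x0, x1)
    b = complex(x2, x3)
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    """Conjugate transpose X*."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

A = su2(0.5, 0.5, 0.5, 0.5)          # a point with x0^2 + ... + x3^2 = 1
AhA = mul(dagger(A), A)              # should be the identity
detA = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # should be 1
traceA = A[0][0] + A[1][1]           # equals 2*x0
```

The trace comes out as $2x_0$, which is the quantity used later to define latitudes.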
One way to analyze this is by stereographic projection. Let the $x_0$ axis point north, and note that the identity matrix $I$ corresponds to the point $(1, 0, 0, 0)$. So let $I$ be the north pole; then for any matrix $A$, project $A$ onto the plane $x_0 = 0$. This is defined everywhere, except the north pole $I$ gets sent to infinity. Here's a visual depiction (from Google) of the stereographic projection of the 2-sphere:

[Figure: stereographic projection of the 2-sphere from the north pole N.]

Let's compute the formula for this projection explicitly: let $A$ correspond to the point $(x_0, x_1, x_2, x_3)$. Then the line $\ell$ through $I$ and $A$ will be the line $I + t(A - I)$ for a scalar $t$, which is equal to $(1 - t + tx_0, tx_1, tx_2, tx_3)$. Setting the first coordinate to zero, we want $t = \frac{1}{1 - x_0}$. Thus, we've found the projection of $SU_2$ down to $\mathbb{R}^3$:
$$\pi(A) = \left(0, \frac{x_1}{1 - x_0}, \frac{x_2}{1 - x_0}, \frac{x_3}{1 - x_0}\right).$$


When we project a sphere in 4-space down to 3-space, we'll get an ellipsoid. Consider the intersection of $x_0 = 0$ with the unit 3-sphere $x_0^2 + x_1^2 + x_2^2 + x_3^2 = 1$, and call this the equator $E$. The interior of this region can be represented as $x_1^2 + x_2^2 + x_3^2 \leq 1$, which is $B^3$, the unit 3-ball. Notice that there are two points that are directly above or below each point inside the equator (except the boundary).


The lower half of the sphere goes inside the equator and the upper half of the sphere goes outside the equator under this stereographic projection. Also, every point in $V$ (the hyperplane containing $E$) corresponds to a point on the sphere; we just need to add a point at infinity to get $I$. (This is kind of like adding one point to a line to get a circle.) So if we let $S^3$ be a 3-sphere, we can represent this as
$$S^3 = V \cup \{I\},$$
which is $\mathbb{R}^3$ plus a point, or
$$S^3 = B^3 \cup_E B^3,$$
two 3-balls glued along the equator $E$.
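The projection formula can be checked numerically with a short Python sketch. The names `project` and `unproject` are hypothetical, and the inverse formula in `unproject` is the standard one for stereographic projection rather than something derived in these notes. The sample points are chosen so that one lies in the lower half ($x_0 < 0$) and one in the upper half ($x_0 > 0$).

```python
import math

def project(x):
    """Stereographic projection from the north pole (1, 0, 0, 0) to x0 = 0."""
    x0, x1, x2, x3 = x
    t = 1.0 / (1.0 - x0)                 # undefined at the north pole x0 = 1
    return (t * x1, t * x2, t * x3)

def unproject(y):
    """Point of the unit 3-sphere that projects to y (standard inverse)."""
    n2 = sum(c * c for c in y)
    return ((n2 - 1.0) / (n2 + 1.0),) + tuple(2.0 * c / (n2 + 1.0) for c in y)

def norm(y):
    return math.sqrt(sum(c * c for c in y))

lower = (-0.6, 0.8, 0.0, 0.0)            # x0 < 0: lower half of the sphere
upper = (0.6, 0.0, 0.8, 0.0)             # x0 > 0: upper half of the sphere

inside = norm(project(lower))            # lands inside the unit ball
outside = norm(project(upper))           # lands outside the unit ball
round_trip = unproject(project(lower))   # recovers the original point
```

The squared length of the image is $(1 + x_0)/(1 - x_0)$, which is less than 1 exactly when $x_0 < 0$; this matches the statement about the two halves of the sphere.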


$SU_2$ is very symmetric; let's undo some of that symmetry! Let $E$ be the set of points in $SU_2$ that satisfy $x_0 = 0$. The trace of the matrix $\begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}$ is $2x_0$, since $a = x_0 + x_1i$. In other words, the equator $E$ consists of those points where the trace of $A$ is 0.


Definition 272 


A latitude of $SU_2$ is the locus of points $x_0 = c$ for some $-1 < c < 1$. This is equivalent to saying that the trace of $A$ is $2c$.

In particular, the points in $SU_2$ on a given latitude are those that satisfy $x_1^2 + x_2^2 + x_3^2 = 1 - c^2$.


Proposition 273 


The latitude xo = c is a conjugacy class. 
Proof. We'll use the Spectral Theorem! $A$ is unitary, so there exists a unitary $P$ such that $P^*AP$ is diagonal. This diagonal matrix will be of the form $D = \begin{pmatrix} \lambda & 0 \\ 0 & \bar\lambda \end{pmatrix}$, and this still has determinant 1, so $\lambda$ must have absolute value 1. If $P$ had determinant 1, we'd be done (because we would show that every matrix of a fixed trace is conjugate to $D$). Well, the determinant of $P^*$ and the determinant of $P$ are complex conjugates of each other, and $P^*P = I$, so the determinant of $P$ is equal to $\delta$ for some $|\delta| = 1$.

Now just change $P$ to $P\begin{pmatrix} \delta^{-1/2} & 0 \\ 0 & \delta^{-1/2} \end{pmatrix}$, and the determinant of $P$ (and therefore also $P^*$) will be 1. $P^*AP$ will still be diagonal, so every $A$ in $SU_2$ is conjugate to a diagonal matrix. Well, conjugate elements have equal eigenvalues, so two diagonal matrices of this type with different traces cannot be conjugate.

Thus our matrix is conjugate to a matrix of the form $\begin{pmatrix} \lambda & 0 \\ 0 & \bar\lambda \end{pmatrix}$, where $\lambda + \bar\lambda$ is $2c$ (so $\lambda = c \pm \sqrt{1 - c^2}\,i$). The two matrices given by the two choices of sign are also conjugate to each other (if we conjugate by $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$), so $x_0 = c$ is indeed a conjugacy class, as desired.
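Two steps of this proof are easy to test numerically, and the following Python sketch does so under stated assumptions: the matrices `A`, `P`, the angle `0.7`, and all helper names are arbitrary choices for illustration. It checks that conjugation inside $SU_2$ keeps the trace (so it stays on the same latitude), and that conjugating by $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ swaps the diagonal entries $\lambda$ and $\bar\lambda$.

```python
import cmath

def su2(a, b):
    """Element of SU_2 from complex a, b with |a|^2 + |b|^2 = 1."""
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

A = su2(complex(0.5, 0.5), complex(0.5, 0.5))
P = su2(complex(0.6, 0.0), complex(0.0, 0.8))
conjugated = mul(mul(P, A), dagger(P))          # P A P*, same latitude as A

lam = cmath.exp(0.7j)                            # |lam| = 1
D = [[lam, 0], [0, lam.conjugate()]]
J = [[0, 1], [-1, 0]]                            # lies in SU_2, J* = J^{-1}
swapped = mul(mul(J, D), dagger(J))              # diag(conj(lam), lam)
```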


Now, let's define a Hermitian form on $SU_2$:

Definition 274

Say that a matrix $A = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} \in SU_2$ corresponds to $(x_0, x_1, x_2, x_3)$. If $B$ similarly corresponds to $(y_0, y_1, y_2, y_3)$, define the bilinear form $\langle A, B \rangle = x \cdot y$.


In other words, we carry over the (real-valued) dot product on R* over to SU2, and there's a nice way to describe 
this: 


Proposition 275 


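The Hermitian form on $SU_2$ satisfies $\langle A, B \rangle = \frac{1}{2}\operatorname{tr}(A^*B)$.

Proof. This is a direct computation: Let $A = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}$, and let $B = \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$. Then
$$A^*B = \begin{pmatrix} \bar a\alpha + b\bar\beta & ? \\ ? & \bar b\beta + a\bar\alpha \end{pmatrix}.$$
So the trace is
$$(\bar a\alpha + a\bar\alpha) + (\bar b\beta + b\bar\beta).$$
Now $(\bar a\alpha + a\bar\alpha) = (x_0 - x_1i)(y_0 + y_1i) + (x_0 + x_1i)(y_0 - y_1i)$; the cross-terms cancel and we're left with $2x_0y_0 + 2x_1y_1$. The other term gives us $2x_2y_2 + 2x_3y_3$, and plugging everything in indeed yields our result.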
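The identity $\langle A, B \rangle = \frac{1}{2}\operatorname{tr}(A^*B)$ can be spot-checked in Python. This is a minimal sketch with made-up helper names (`su2`, `form`) and arbitrarily chosen unit vectors `x` and `y`; it compares the trace formula against the ordinary dot product in $\mathbb{R}^4$.

```python
def su2(x):
    """Matrix of SU_2 for a unit vector x = (x0, x1, x2, x3)."""
    a = complex(x[0], x[1])
    b = complex(x[2], x[3])
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def form(A, B):
    """(1/2) tr(A* B)."""
    M = mul(dagger(A), B)
    return 0.5 * (M[0][0] + M[1][1])

x = (0.5, 0.5, 0.5, 0.5)
y = (0.6, 0.0, 0.8, 0.0)
dot = sum(p * q for p, q in zip(x, y))     # dot product in R^4
via_trace = form(su2(x), su2(y))           # complex number, imaginary part 0
```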


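One nice property of this form is that $\langle A, A \rangle = \frac{1}{2}\operatorname{tr}(A^*A) = 1$. Note that if $A$ is on the equator,
$$\langle I, A \rangle = \frac{1}{2}\operatorname{tr}(I^*A) = \frac{1}{2}\operatorname{tr}(A) = 0.$$
So $I$ and $A$ are orthogonal unit vectors for any $A \in E$, which makes sense: the north pole is orthogonal to the equator.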


Definition 276 


A longitude of $SU_2$ is a 1-dimensional great-sphere $L$ through the north and south poles $I$ and $-I$.

Here's one way to describe a longitude more explicitly: take a two-dimensional subspace $W$ of $\mathbb{R}^4$ which contains $(\pm 1, 0, 0, 0)$, and intersect it with $S^3$. This will intersect the equator at two points: call them $A$ and $-A$ (they are antipodes). Then $(I, A)$ forms an orthonormal basis for $W$, since the two vectors are orthogonal.

This longitude $L$ is then a unit circle in our two-dimensional subspace $W$. Since $I$ and $A$ are orthonormal, $L$ consists of the points $\{\cos\theta\, I + \sin\theta\, A\}$. This will have length 1 because, writing $c = \cos\theta$ and $s = \sin\theta$,
$$\langle cI + sA, cI + sA \rangle = c^2\langle I, I \rangle + 2cs\langle I, A \rangle + s^2\langle A, A \rangle = c^2 + s^2 = 1.$$


By the way, it’s interesting that the 1-sphere has a group law and so does the 3-sphere, but the 2-sphere does not. 
It turns out those two are the only spheres with group laws, and that’s a theorem in topology. This is related to the 


Hairy Ball theorem, and we should search up a paper by Milnor for more information! 
Today we're going to talk about $SO_3$, but it'll take a while to get there. Recall the definition of $SU_2$: it is the set of all matrices of the form $A = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}$ where $a\bar a + b\bar b = 1$. This can be bijected to the 3-sphere in $\mathbb{R}^4$: letting $a = x_0 + x_1i$, $b = x_2 + x_3i$, we want $x_0^2 + x_1^2 + x_2^2 + x_3^2 = 1$. The characteristic polynomial of $A$ is $t^2 - 2x_0t + 1$, so the eigenvalues of our matrix $A$ are $x_0 \pm i\sqrt{1 - x_0^2}$, which is $\cos\theta \pm i\sin\theta$ for some $\theta$.

Also recall that we defined a form on $SU_2$ as well: if $A$ corresponds to $X$ in $\mathbb{R}^4$ and $B$ corresponds to $Y$, we just want $\langle A, B \rangle = X \cdot Y$. This turns out to be equal to $\frac{1}{2}\operatorname{tr}(A^*B)$.

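We also had a geometric interpretation of all of this: define the equator $E$ to be $\{A \in SU_2 \mid x_0 = 0\}$. $A$ being on the equator is equivalent to having a trace of 0, which is equivalent to saying the eigenvalues are $\pm i$, which is equivalent to saying $A^2$ has a double eigenvalue at $-1$, so $A^2 = -I$ (since $A$ is diagonalizable). This is additionally equivalent to having $I \perp A$ (since $\langle I, A \rangle = 0$). Finally, this is equivalent to $A$ being skew-Hermitian: $A^* = -A$, since $a$ must be pure imaginary. Thus, the equator is the unit sphere in a real vector space of dimension 3, the skew-Hermitian matrices of trace 0 ($b$ can be anything and $a$ is pure imaginary); an orthonormal basis of this space is
$$\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.$$
Denote these $\mathbf{i}, \mathbf{j}, \mathbf{k}$ respectively. These satisfy the quaternion relations
$$\mathbf{ij} = -\mathbf{ji} = \mathbf{k}, \quad \mathbf{jk} = -\mathbf{kj} = \mathbf{i}, \quad \mathbf{ki} = -\mathbf{ik} = \mathbf{j}.$$
As a sidenote, these are actually $i$ times the Pauli matrices. If we multiply a skew-Hermitian matrix by $i$, we get a Hermitian matrix, so (up to sign) we end up with the three Hermitian matrices
$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Recall that the equator $E$ is a conjugacy class in $SU_2$, so we can say that $SU_2$ operates on $E$. Given $P \in SU_2$, $A \in E$, the operation is that
$$P * A = PAP^*.$$
Geometrically, the equator $E$ is a 2-sphere in the space $V$ with coordinates $(x_1, x_2, x_3)$ (since $x_0 = 0$).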
Theorem 277

The operation of $P \in SU_2$ on the equator $E$ is a rotation.

One way we could show this is by writing $A = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}$, doing a similar thing for $P$, and showing that $PAP^*$ gives another matrix similar to $A$. But this is a disgusting computation; instead, let's use Euler's theorem.


Proof. We know that rotations are the elements of $SO_3$: linear operators that are orthogonal and have determinant 1. Thus, we just need to verify that conjugation by $P$ satisfies all of these! For linearity, we just need to show $P(A_1 + A_2)P^* = PA_1P^* + PA_2P^*$ (true by the distributive law) and that $P(rA)P^* = rPAP^*$ (which is also true). To show orthogonality, we show that the operator preserves the Hermitian form: we want $\langle A, B \rangle = \langle PAP^*, PBP^* \rangle$. We know that
$$\langle A, B \rangle = \frac{1}{2}\operatorname{tr}(A^*B).$$
Note that
$$\langle PAP^*, PBP^* \rangle = \frac{1}{2}\operatorname{tr}(PA^*P^*PBP^*) = \frac{1}{2}\operatorname{tr}(PA^*BP^*),$$
and conjugation preserves eigenvalues, so the two above expressions are equal since the trace is the sum of the eigenvalues of a matrix!

Finally, for the determinant, we argue by continuity. $SU_2$ is path-connected, so draw a path from $I$ to $P$. As $M$ ranges along this path, the operator $A \mapsto MAM^*$ is orthogonal, so its determinant is always 1 or $-1$. This determinant varies continuously with $M$ and equals 1 at $M = I$, so it can't jump to $-1$, and therefore conjugation by $P$ must have determinant 1 as well.


So what is the rotation? We want to determine both the axis and the angle given conjugation by some matrix $P$. To find the axis of rotation, we want to find a fixed point (since the axis will just be the line through the origin and that fixed point). So given $P$, we want to find a matrix $A$ such that $PAP^* = A$. Let's say $P$ is on some given longitude (recall this is a great circle through $I$ and $-I$). If the great circle intersects the equator at a point $B$ (it'll intersect at two points, pick one), then we can write
$$P = \cos\theta\, I + \sin\theta\, B.$$
So, writing $c = \cos\theta$ and $s = \sin\theta$, a direct computation shows that
$$PAP^* = (cI + sB)A(cI^* + sB^*) = (cI + sB)A(cI - sB)$$
($B$ is on the equator, so $B^*B = I$ and $B^2 = -I$. This means $B^* = -B$; that is, it is a skew-Hermitian matrix). Now note that $B$ commutes with $cI - sB$, so
$$PBP^* = (cI + sB)(cI - sB)B = c^2B - s^2B^3 = (c^2 + s^2)B = B$$
and so $B$ is fixed: we've found our fixed point!

This just leaves our other question: what is the angle of rotation? If we conjugate by $I$ or by $-I$, we get the identity, so it seems like the angle is moving twice as fast as we do around our 3-sphere. So we can make the following guess:

Proposition 278

(Let $B$ be a point on the equator.) The angle $\alpha$ of rotation by $P = cI + sB$, where $c = \cos\theta$, $s = \sin\theta$, is $\pm 2\theta$.
Proof. Take a point on the equator $A \in E$ such that $A \perp B$. Letting $C = AB$, we want to check that $(A, B, C)$ is an orthonormal basis of the 3-dimensional space containing $E$, with relations $AB = -BA = C$, $BC = -CB = A$, $CA = -AC = B$, $A^2 = B^2 = C^2 = -I$. Notably, since $\operatorname{tr}(A^*B) = 0$ (by construction) and $A^* = -A$ (we showed that matrices on the equator are skew-Hermitian), $\operatorname{tr}(AB) = 0$ as well, so $C \in E$. Verifying the other relations comes down to calculations like $-(AB) = (AB)^* = B^*A^* = (-B)(-A) = BA$. So now
$$PAP^* = (cI + sB)A(cI - sB) = c^2A - s^2BAB + cs(BA - AB)$$
and since $AB = C$, $BA = -C$, and $BAB = BC = A$, this is equal to
$$(c^2 - s^2)A - 2csC = \cos(2\theta)A - \sin(2\theta)C,$$
so $\alpha = -2\theta$.
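The doubling of the angle is easy to confirm numerically. In the Python sketch below, the choices $B = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, and $\theta = 0.3$ are assumptions made for illustration (any orthogonal pair on the equator should behave the same way), and the helper names are invented. It compares $PAP^*$ with $\cos(2\theta)A - \sin(2\theta)C$ and also checks that $B$ is fixed.

```python
import math

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def combo(u, X, v, Y):
    """Entrywise u*X + v*Y."""
    return [[u * X[i][j] + v * Y[i][j] for j in range(2)] for i in range(2)]

I2 = [[1, 0], [0, 1]]
B = [[1j, 0], [0, -1j]]          # point on the equator (the rotation axis)
A = [[0, 1], [-1, 0]]            # point on the equator orthogonal to B
C = mul(A, B)                    # third basis vector C = AB

theta = 0.3
c, s = math.cos(theta), math.sin(theta)
P = combo(c, I2, s, B)           # P = cos(theta) I + sin(theta) B

rotated = mul(mul(P, A), dagger(P))                           # P A P*
expected = combo(math.cos(2 * theta), A, -math.sin(2 * theta), C)
axis_image = mul(mul(P, B), dagger(P))                        # should equal B
```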


So now consider the map
$$SU_2 \xrightarrow{\text{operates on } E} SO_3.$$
The kernel of this map is $\{\pm I\}$, and the map is surjective, so we finally have a way of representing our group $SO_3$:
$$SO_3 \cong SU_2/\{\pm I\}.$$
In one dimension, this is saying that if we take half a circle and glue the two endpoints together, we get a circle. In two dimensions, if we take half a sphere and glue diametrically opposite points of its boundary, we get a surface that contains a Möbius band. This is a non-orientable surface, and it's called the real projective plane $\mathbb{RP}^2$. Finally, given a 3-sphere, the picture is just completely confusing. But $SO_3$ is also called the real projective 3-space $\mathbb{RP}^3$.
Recall that the exponential function has the power series
$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots.$$

Definition 279

The matrix exponential is defined as
$$e^A = I + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots.$$


Professor Artin imagines that someone was completely clueless and came up with this as a definition. This has 


some properties: 


• This series converges uniformly on any bounded set of matrices. This is analysis, and it's not too hard to prove.

• Conjugating $e^A$ with $P$ does something nice:
$$Pe^AP^{-1} = e^{PAP^{-1}}.$$
This can be easily checked by just plugging both sides into the definition and using distributivity.

• The eigenvalues of $e^A$ are $e^\lambda$, where $\lambda$ is an eigenvalue of $A$. This is because if we have $Av = \lambda v$, then
$$e^Av = \left(I + \frac{A}{1!} + \frac{A^2}{2!} + \cdots\right)v = v + \frac{\lambda}{1!}v + \frac{\lambda^2}{2!}v + \cdots,$$
and this is equal to
$$\left(1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \cdots\right)v = e^\lambda v.$$

• If $AB = BA$, then $e^{A+B} = e^Ae^B$.


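Let's do an example: take $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. Then $A^2 = 0$, so
$$e^A = I + A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$
Similarly, if we take $B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, since $B^2 = B$, we get
$$e^B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \frac{1}{1!}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \frac{1}{2!}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \cdots = \begin{pmatrix} e & 0 \\ 0 & 1 \end{pmatrix}.$$
But notice that $AB \neq BA$, and $e^{A+B} \neq e^Ae^B$ in general.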


(One comment: it’s easy to compute the exponential for a diagonal matrix; just exponentiate each entry.) 
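The failure of $e^{A+B} = e^Ae^B$ for non-commuting matrices can be seen with a truncated power series. The sketch below is illustrative only: the function `expm` is a hypothetical helper that sums the first 30 terms of the series (it is not a library routine), and the matrices $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ are taken as an assumed example of a non-commuting pair.

```python
import math

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def expm(X, terms=30):
    """Truncated series I + X/1! + X^2/2! + ... (fine for small matrices)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in mul(term, X)]   # X^n / n!
        result = add(result, term)
    return result

A = [[0.0, 1.0], [0.0, 0.0]]     # A^2 = 0
B = [[1.0, 0.0], [0.0, 0.0]]     # B^2 = B
product = mul(expm(A), expm(B))  # e^A e^B
combined = expm(add(A, B))       # e^(A+B)
```

Here the top-right entries differ: the product has 1 there, while $e^{A+B}$ has $e - 1$, because $A + B$ is itself idempotent.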


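• Now let's write down the matrix exponential $e^{At}$ as a function of $t$:
$$e^{At} = I + \frac{1}{1!}At + \frac{1}{2!}A^2t^2 + \cdots.$$
Notice that $t$ is a scalar, so we must have
$$e^{As + At} = e^{As}e^{At}$$
for any scalars $s, t$, because $As$ and $At$ commute. Thus, we have a homomorphism $\mathbb{R}^+ \to GL_n$ via $t \mapsto e^{At}$.

• $e^{At}$ is a differentiable function of $t$; in particular, define
$$\frac{d}{dt}e^{At} = \lim_{\Delta t \to 0} \frac{e^{A(t + \Delta t)} - e^{At}}{\Delta t}.$$
This can be written as
$$\lim_{\Delta t \to 0} \frac{e^{A\Delta t}e^{At} - e^{At}}{\Delta t} = \left(\lim_{\Delta t \to 0} \frac{e^{A(0 + \Delta t)} - e^{A \cdot 0}}{\Delta t}\right)e^{At} = \left.\frac{d}{dt}\left(e^{At}\right)\right|_{t=0} e^{At}.$$
But working with this isn't actually necessary for computation. Instead, differentiating term by term,
$$\frac{d}{dt}\left(I + \frac{1}{1!}At + \frac{1}{2!}A^2t^2 + \frac{1}{3!}A^3t^3 + \cdots\right) = A + \frac{1}{1!}A^2t + \frac{1}{2!}A^3t^2 + \cdots = Ae^{At}.$$
What's important here is that $e^{At}$ solves the differential equation $\frac{d\psi}{dt} = A\psi$.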
Definition 280

A one-parameter group in $GL_n$ is a differentiable homomorphism
$$\varphi : \mathbb{R}^+ \to GL_n$$
satisfying $\varphi(s + t) = \varphi(s)\varphi(t)$. Its derivative is defined to be
$$\frac{d\varphi}{dt} = \lim_{\Delta t \to 0} \frac{\varphi(t + \Delta t) - \varphi(t)}{\Delta t} = \left.\frac{d\varphi}{dt}\right|_{t=0}\varphi(t)$$
(just like above).


So any one-parameter group satisfies $\frac{d\varphi}{dt} = A\varphi$ for some matrix $A$. It turns out all one-parameter groups look like matrix exponentials:

Proposition 281

If $\frac{d\varphi}{dt} = A\varphi$, where $\varphi$ is a one-parameter group, then $\varphi(t) = e^{At}$.


Proof. Consider $e^{-At}\varphi(t)$. By the product rule,
$$\frac{d}{dt}\left(e^{-At}\varphi(t)\right) = -Ae^{-At}\varphi(t) + e^{-At}\frac{d\varphi}{dt} = -Ae^{-At}\varphi + e^{-At}(A\varphi) = 0$$
since $A$ commutes with $e^{-At}$ (which is just a sum of powers of $A$). Thus $e^{-At}\varphi(t)$ must be a constant matrix, so
$$e^{-At}\varphi(t) = C \implies \varphi(t) = e^{At}C.$$
Putting $t = 0$, $e^0\varphi(0) = C$, but $e^0$ is the identity and $\varphi(0)$ is as well (because we have a homomorphism). Thus $C = I$ and $\varphi(t) = e^{At}$.


Example 282 


What are the one-parameter groups in the orthogonal group $O_n \subset GL_n$?

In other words, we want a differentiable homomorphism $\mathbb{R}^+ \to GL_n$ with image in $O_n$. Thus, we want $\varphi = e^{At}$ to be orthogonal for all $t$, which means
$$(e^{At})^T(e^{At}) = I$$
for all $t$. Let's differentiate: $(e^{At})^T = e^{A^Tt}$ (look at each term in the definition), so by the product rule, we have
$$(A^Te^{A^Tt})(e^{At}) + (e^{A^Tt})(Ae^{At}) = 0,$$
which at $t = 0$ implies that we want $A^T + A = 0$. Plugging this back in, all such matrices work:
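Corollary 283

The one-parameter groups in $O_n$ are of the form $e^{At}$ for a skew-symmetric matrix $A$.

For example, when $n = 2$ and $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$,
$$e^{At} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \frac{t}{1!}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} + \frac{t^2}{2!}\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} + \frac{t^3}{3!}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + \cdots = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix},$$
so the one-parameter groups are the obvious rotation matrices.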
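As a numerical illustration of this corollary, the following Python sketch sums the exponential series for the skew-symmetric matrix $\begin{pmatrix} 0 & -t \\ t & 0 \end{pmatrix}$ and compares the result with the rotation matrix. The helper `expm` is a hypothetical truncated-series function written for this example, and the value $t = 1.2$ is arbitrary.

```python
import math

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def expm(X, terms=40):
    """Truncated series I + X/1! + X^2/2! + ... for a small 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in mul(term, X)]   # X^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

t = 1.2
R = expm([[0.0, -t], [t, 0.0]])                  # e^{At}, A skew-symmetric
rotation = [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]
RtR = mul(transpose(R), R)                       # should be the identity
```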
Recall that a one-parameter group in $GL_n$ is a differentiable homomorphism $\mathbb{R}^+ \to GL_n$; we know that $\varphi(t) = e^{At}$ must be a matrix exponential, where $A$ is the derivative of the homomorphism at $t = 0$.
We found the one-parameter groups in $O_n \subset GL_n$ to be $e^{At}$, where $A^T = -A$ is skew-symmetric. We'll continue this kind of characterization now:


Example 284 


What are the one-parameter groups in $SL_n \subset GL_n$?

We want $A$ such that $e^{At}$ has determinant 1 for all $t$. Luckily, we have a nice formula:
$$\det e^A = e^{\operatorname{tr} A}.$$
This is true since the eigenvalues of $e^A$ are $e^\lambda$, where $\lambda$ is an eigenvalue of $A$, and the determinant of a matrix is the product of its eigenvalues.

So if we want $e^{At}$ to have determinant 1, and $t$ is a scalar, we want
$$e^{t(\operatorname{tr} A)} = 1 \quad \forall t.$$
This happens if and only if the trace of $A$ is 0, and all such matrices correspond to one-parameter groups $e^{At}$.


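For example, for $n = 2$, we can have $A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, which yields $e^{At} = \begin{pmatrix} e^t & 0 \\ 0 & e^{-t} \end{pmatrix}$. We can also have $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, which gives $e^{At} = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$. $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ gives $\begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix}$, and so on. (Keep in mind that the one-parameter groups always trace out a continuous path in our matrix space.)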
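The formula $\det e^A = e^{\operatorname{tr} A}$ and the $\cosh$/$\sinh$ example can both be checked with a truncated series. In this Python sketch the helper `expm` is hypothetical (a plain partial sum, not a library function), and the matrices `A` (trace 0), `B` (trace 0.75) and the value $t = 0.8$ are arbitrary choices for illustration.

```python
import math

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(X, terms=40):
    """Truncated series I + X/1! + X^2/2! + ... for a small 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in mul(term, X)]   # X^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

A = [[1.0, 2.0], [3.0, -1.0]]        # trace 0, so e^A should lie in SL_2
B = [[0.5, 1.0], [0.0, 0.25]]        # trace 0.75
det_exp_A = det(expm(A))             # expected e^0 = 1
det_exp_B = det(expm(B))             # expected e^0.75

t = 0.8
H = expm([[0.0, t], [t, 0.0]])       # expected [[cosh t, sinh t], [sinh t, cosh t]]
```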


Example 285 


What are the 1-parameter groups in some subgroup of triangular matrices? Let's say, for example, that we want 1s on the diagonal and 0s below the diagonal.

Then $\frac{d}{dt}(e^{At})$ has 0s on and below the diagonal, and evaluating
$$\frac{d}{dt}(e^{At}) = Ae^{At}$$
at $t = 0$, we want $A$ to have zeros on and below the diagonal.


Example 286 


What are the one-parameter groups in $SU_2$?

We're supposed to know that $SU_2$ is a 3-sphere in $\mathbb{R}^4$, and longitudes are circles of the form $\cos\theta\, I + \sin\theta\, A$ for some $A$ with trace 0.

Well, now we're not working over the real numbers anymore. Let's say we want to look at $SU_2 \subset GL_2(\mathbb{C})$. What does it mean for such a map to be differentiable? We can think of functions of a complex variable in terms of their real and imaginary parts. In fact, the maps here are given by complex solutions to $\frac{d\varphi}{dt} = A\varphi$, and the same proof still shows that $\varphi = e^{At}$; nothing funny happens here.

We still want $A$ to satisfy
$$(e^{At})^* = (e^{At})^{-1}, \quad \det e^{At} = 1.$$
The second condition means $A$ has trace 0. Note that
$$(e^A)^* = \left(I + A + \frac{1}{2}A^2 + \cdots\right)^* = I + A^* + \frac{1}{2}(A^*)^2 + \cdots,$$
and since $(A^2)^* = (A^*)^2$, we see that this means
$$(e^A)^* = e^{A^*}.$$
(It is important that $t$ is real, so it's equal to its complex conjugate and we don't have to worry about conjugating it here.) So we want
$$e^{A^*t}e^{At} = (e^{At})^*e^{At} = I$$
for all $t$. Differentiating, by the product rule,
$$(A^*e^{A^*t})(e^{At}) + (e^{A^*t})(Ae^{At}) = 0.$$
Putting $t = 0$, $A^* + A = 0$, so $A$ should be skew-Hermitian and have trace 0. So the one-parameter groups are actually tracing out the longitudes in $SU_2$, since a basis for these matrices is
$$\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.$$
These correspond to the one-parameter groups $\begin{pmatrix} e^{it} & 0 \\ 0 & e^{-it} \end{pmatrix}$, $\begin{pmatrix} c & s \\ -s & c \end{pmatrix}$, and $\begin{pmatrix} c & is \\ is & c \end{pmatrix}$, where $c = \cos t$ and $s = \sin t$.

$I$ and $-I$ are in every 1-parameter group of $SU_2$, but the rest of the longitudes partition $SU_2$. So there's always a path to any matrix in $SU_2$. Let's go back to looking at $SL_2$: can we always get to a given matrix $P$ in $SL_2$ along 1-parameter groups? (Some confusion came up during class, but the answer was discussed next class.)
Definition 287

The Lie algebra of a matrix group $G$ is the space of tangent vectors to $G$ at $I$.

Let's try to define this based off the notion from calculus of a tangent vector to a path. If we have a path $x(t)$ and $p = x(t_0)$, then the derivative $v = \left.\frac{dx}{dt}\right|_{t = t_0}$ is a tangent vector to the path at $p$.

We'll make an analogous definition for matrices:

Definition 288

Let $x(t)$ be a matrix-valued function. Then the derivative $\frac{dx}{dt}$ is also a matrix:
$$\frac{dx}{dt} = \lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t}.$$
Note that we just take the derivatives of the individual matrix entries with respect to $t$.


Example 289 


What is the Lie algebra of $SU_2$?

A vector (aka matrix) $A$ is tangent to $SU_2$ at $I$ if there exists a path $x(t)$ with the following properties:

• It exists for some interval $-r < t < r$.

• $x(t) \in SU_2$ for all $t$, and $x(0) = I$.

• $x$ is differentiable, and $\frac{dx}{dt}$ at $t = 0$ evaluates to $A$.

Well, for $x(t) \in SU_2$ to be true, we must have $x(t)^*x(t) = I$ and $\det x(t) = 1$ for all $t$. Let's take the derivatives of those two statements. Note that $\left(\frac{dx}{dt}\right)^* = \frac{dx^*}{dt}$, so by the product rule, differentiating $x^*x = I$ yields
$$\frac{dx^*}{dt}x + x^*\frac{dx}{dt} = 0.$$
Putting in $t = 0$, we have $x(0) = x^*(0) = I$, and the derivatives become $A^*$ and $A$ respectively. Thus, $A^* + A = 0$, and $A$ must be skew-Hermitian.

Next, if the determinant of $x(t)$ is 1, then the derivative of $\det x(t)$ is 0. Thus the derivative of $x_{11}x_{22} - x_{12}x_{21}$ is
$$x_{11}'x_{22} + x_{11}x_{22}' - x_{12}'x_{21} - x_{12}x_{21}' = 0.$$
Plugging in $t = 0$, since $x(0) = I$, $x_{11} = x_{22} = 1$ and $x_{12} = x_{21} = 0$, and we have
$$x_{11}'(0) + x_{22}'(0) = 0.$$
Thus the trace of $A$ is 0 as well.


Remark 290. We could have guessed this because we know some paths already: $x(t) = e^{At}$ is a path in $GL_2$ where the derivative of $x$ at $t = 0$ is $A$, and it's in $SU_2$ if $A^* = -A$ and the trace of $A$ is 0.

Corollary 291

$A$ is in the Lie algebra of $SU_2$ if and only if $e^{At}$ is a one-parameter group in $SU_2$.

This actually turns out to be true when we replace $SU_2$ with any matrix group!


By the way, we'll quickly address the question from last class: 


Fact 292 


Matrices $P$ in $SL_2(\mathbb{R})$ that can be obtained along a 1-parameter group are those where the eigenvalues of $P$ are positive and real or complex, but not negative real except for the matrix $-I$.

Let $L$ be a Lie algebra (let's say of $SU_2$ for now).


Definition 293 


The Lie algebra is not a group, but it has a bracket operation which is a law of composition sending A, B € L to 


[A, B] = AB — BA. 


Let's check that this is still in the Lie algebra for $SU_2$. We know that
$$(AB - BA)^* = B^*A^* - A^*B^* = (-B)(-A) - (-A)(-B) = BA - AB = -(AB - BA),$$
so $AB - BA$ is skew-Hermitian. Also, the trace of $BA$ is equal to the trace of $AB$ (for invertible $B$ this is because the trace is preserved under conjugation and $B^{-1}(BA)B = AB$; in general it follows directly from the definition of the trace). So the difference of $AB$ and $BA$ has trace 0, which is what we want! So the bracket operation indeed keeps us in the Lie algebra $L$.


Proposition 294 
Lie algebras satisfy the Jacobi identity
$$[[A, B], C] + [[B, C], A] + [[C, A], B] = 0.$$

Let's check this: it's equal to
$$\sum_{\text{cyc}} [(AB - BA)C - C(AB - BA)] = \sum_{\text{cyc}} [ABC - BAC - CAB + CBA],$$
which is clearly 0. It turns out the bracket acts a lot like the commutator: let's consider $xyx^{-1}y^{-1}$, and write $x = 1 + a$ and $y = 1 + b$ such that $a^2, b^2$ are small enough to be neglected, but not $ab$. Then the commutator is
$$(1 + a)(1 + b)(1 - a)(1 - b) = (1 + a + b + ab)(1 - a - b + ab) \approx 1 + ab - ba = 1 + [a, b].$$


Definition 295 (Actual definition) 
A real vector space L is a Lie algebra if it has a bracket operation that is bilinear and skew-symmetric ([A, B] = 
—[B, A]) satisfying the Jacobi identity. 


Example 296 
Take $\mathbb{R}^3$ with the cross-product $[a, b] = a \times b$. It can be verified that this indeed satisfies the Jacobi identity!
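One quick way to gain confidence in this claim is to test it on sample vectors. The Python sketch below is an illustration, not a proof: the function names and the three integer vectors are chosen arbitrarily, and integer arithmetic keeps the check exact.

```python
def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add3(*vectors):
    """Componentwise sum of several vectors."""
    return tuple(sum(components) for components in zip(*vectors))

def jacobi(a, b, c):
    """[[a, b], c] + [[b, c], a] + [[c, a], b] with [a, b] = a x b."""
    return add3(cross(cross(a, b), c),
                cross(cross(b, c), a),
                cross(cross(c, a), b))

a, b, c = (1, 2, 3), (-4, 0, 5), (2, -7, 1)
jacobi_sum = jacobi(a, b, c)                     # expected to vanish
skew = add3(cross(a, b), cross(b, a))            # [a, b] + [b, a], also zero
```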
Recall this concept from earlier in the class:

Definition 297

A group $G$ is simple if it has no proper normal subgroup other than the trivial subgroup.

We know that the icosahedral group is simple, and that the alternating groups $A_n$ for $n \geq 5$ are all simple. Recall that a normal subgroup is a subgroup that is closed under conjugation by elements of $G$ (in addition to multiplication and inverses). In particular, if $x \in N$, $g \in G$, then $gxg^{-1} \in N$. It's also good to recall the commutator $gxg^{-1}x^{-1}$ for $x \in N$, $g \in G$. This is also in $N$, and intuitively this gives a lot of elements if $G$ is “very noncommutative.”


Proposition 298 


$SO_3$, the group of orthogonal matrices with determinant 1, is simple.

Proof. It is easier to look at $SU_2$. There exists a spin homomorphism $SU_2 \to SO_3$, which tells us how a given element of $SU_2$ acts on the equator. The kernel is $\{\pm I\}$ (since those are the only elements of $SU_2$ that fix the equator when we conjugate by them).

Take a normal subgroup of $SO_3$. Its preimage is a normal subgroup of $SU_2$ containing $\{\pm I\}$, so the analogous statement for $SU_2$ is that every proper normal subgroup of $SU_2$ is contained in $\{\pm I\}$.

To prove this, let's say $N$ is normal and contains some $Q \neq \pm I$. We'll show that $N = SU_2$. The entire conjugacy class of $Q$, which is the latitude of matrices that have the same trace as $Q$, is in $N$. So the set of commutators of the form $PQP^{-1}Q^{-1}$, where $Q$ is given and $P$ is arbitrary, is also some subset of $N$. But multiplication by $Q^{-1}$ is continuous, so this set contains points arbitrarily close to (but different from) the identity, and therefore $N$ contains all the conjugacy classes of points near the identity. One way to understand this is to draw a little path starting from $Q$ in the conjugacy class $\{PQP^{-1}\}$. After multiplying by $Q^{-1}$, we get a little path starting from $I$ that is in $N$, which gives lots of things that are small distances away from the identity.

So now just color in the whole sphere: given a longitude of the form $\{\cos\theta\, I + \sin\theta\, A\}$ for some $A$ on the equator, we know that small $\theta$ values are in $N$. But $\cos\theta\, I + \sin\theta\, A = e^{A\theta}$, since $A^2 = -I$ for all matrices on the equator! So now we get the whole longitude by taking powers of $e^{A\theta}$ for some small $\theta$.


The next result, similar to the one we've just proved, is that every proper normal subgroup of $SL_2(\mathbb{R})$ is contained in $\{\pm I\}$. In fact, we can replace $\mathbb{R}$ with any field:

Theorem 299

Let $F$ be a field with at least 4 elements. Then every proper normal subgroup of $SL_2(F)$ is contained in $\{\pm I\}$.


Modding out by $\{\pm I\}$ gives a group called $PSL_2(F)$ for some reason. One thing to mention: finite fields have order $p^n$ for a prime $p$, and if $p = 2$, then $I = -I$. But that's not too important right now. For example, say $F$ is a finite field of order $q$. Then $SL_2(F)$ has order $\frac{(q^2 - 1)(q^2 - q)}{q - 1} = (q - 1)q(q + 1)$. For example, if $|F| = 5, 7, 11$, then $SL_2(F)$ has order 120, 336, 1320, and $PSL_2(F)$ has order 60, 168, 660.
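These orders can be confirmed by brute force for small primes. The Python sketch below is only an illustration: the function name `order_sl2` is made up, and it handles the prime fields $\mathbb{Z}/p$ only (not general finite fields of order $p^n$), counting the 2×2 matrices with determinant 1 modulo $p$.

```python
def order_sl2(p):
    """Count 2x2 matrices over Z/p (p prime) with determinant 1."""
    return sum(1
               for a in range(p) for b in range(p)
               for c in range(p) for d in range(p)
               if (a * d - b * c) % p == 1)

sl2_orders = {p: order_sl2(p) for p in (5, 7, 11)}
# For odd p, I and -I are distinct, so PSL_2 has half as many elements.
psl2_orders = {p: n // 2 for p, n in sl2_orders.items()}
formula = {p: (p - 1) * p * (p + 1) for p in (5, 7, 11)}
```

The brute-force counts agree with $(q - 1)q(q + 1)$ for $q = 5, 7, 11$.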


Proof of theorem. Let $N$ be a normal subgroup of $SL_2 = SL_2(F)$, and let's say it contains some matrix $A \neq \pm I$. Our goal is to show $N = SL_2$.

First of all, the matrices $\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ x & 1 \end{pmatrix}$ generate $SL_2$, so it suffices to show that $N$ contains them. Let $s \in F$ be nonzero, where $s \neq \pm 1$ (so $s \neq s^{-1}$). The matrices with eigenvalues $s, s^{-1}$ form a conjugacy class in $SL_2$.

Claim 300. Suppose a matrix $Q$ has eigenvalues $s, s^{-1}$, and $s \neq \pm 1$. Then the two eigenvectors $v_1$ and $v_2$ (for those eigenvalues) aren't multiples of each other (since $s \neq s^{-1}$), so they form a basis.
Let $B$ be the matrix whose columns are the eigenvectors $v_1, v_2$. We can scale these eigenvectors by scalars so that $B$ has determinant 1 (and is therefore in $SL_2$). Now $Be_i = v_i$ for our standard basis vectors $e_i$; let $P = \begin{pmatrix} s & 0 \\ 0 & s^{-1} \end{pmatrix}$. Then $BPB^{-1}$ sends $v_1$ to $BPe_1 = sBe_1 = sv_1$, and similarly $BPB^{-1}v_2 = s^{-1}v_2$. So $BPB^{-1} = Q$, and therefore $Q$ is conjugate to $\begin{pmatrix} s & 0 \\ 0 & s^{-1} \end{pmatrix}$.


Claim 301. If $N$ contains some matrix with eigenvalues $s, s^{-1}$, then $N = SL_2$.

This is because $N$ contains both $\begin{pmatrix} s^{-1} & 0 \\ 0 & s \end{pmatrix}$ and $\begin{pmatrix} s & sx \\ 0 & s^{-1} \end{pmatrix}$ (both have eigenvalues $s, s^{-1}$, so they lie in the same conjugacy class, which is contained in $N$), so it contains their product, which is $\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$. We can get the lower triangular version too (with an analogous argument), and now we can generate all elements of $SL_2$, meaning that $N = SL_2$.

So now we're ready for the actual proof. Let's say $N$ contains some element $A \neq \pm I$. Choose a vector $v_1$ that is not an eigenvector of $A$; then $v_2 = Av_1$ is not a multiple of $v_1$, so we have a basis $(v_1, v_2)$.

Choose $P \in SL_2$ such that $v_1, v_2$ are eigenvectors with eigenvalues $r, r^{-1}$. Then the commutator $C = PAP^{-1}A^{-1}$ is an element of $N$, and
$$Cv_2 = PAP^{-1}A^{-1}v_2 = PAP^{-1}v_1 = r^{-1}PAv_1 = r^{-1}Pv_2 = r^{-2}v_2,$$
so $C$ is an element of $N$ with one eigenvalue $r^{-2}$. (The other eigenvalue is therefore $r^2$.) The only thing that remains is to show that there exists $r$ whose square is not $\pm 1$. There are at most 4 solutions in a field for $r^2 = \pm 1$, so if the field has more than 5 elements, this works. So we now just need to think about $p = 5$, but luckily that also works!


See 18.702 for the continuation of this class. 




